Correct data structure for this problem

**buran** · Oct-09-2020, 11:09 AM

This is the third time I have started writing this (because of our problems with the site), and I lost 2 long drafts, so now I am pissed off and this post will be as short as possible.
This is the X12 EDI format, a HIPAA 835 file to be precise. I don't know why you were reluctant to say so from the start, or when I asked.
I looked for the specification online, but it's hard to obtain one for free. There are various companion guides available, but they are not exhaustive and are also company-specific. I found this one most useful: https://passporthealthplan.com/wp-conten...-guide.pdf
It's still outdated, e.g. the CLP segment they show has only 6 elements, while your CLP segments have more.
I am sure you know all this, but I say it for the benefit of others.
I also found a sample file here: https://www.emedny.org/HIPAA/5010/5010_s...index.aspx I downloaded the 835 Sample (Professional Claims Only- With Payment) file and saved it as sample835.txt
Now I will work with it.

Output:
ISA*00*          *00*          *ZZ*EMEDNYBAT      *ZZ*ETIN           *100101*1000*^*00501*006000600*0*T*:~GS*HP*EMEDNYBAT*ETIN*20100101*1050*6000600*X*005010X221A1~ST*835*1740~BPR*I*45.75*C*ACH*CCP*01*111*DA*33*1234567890**01*111*DA*22*20100101~TRN*1*10100000000*1000000000~REF*EV*ETIN~DTM*405*20100101~N1*PR*NYSDOH~N3*OFFICE OF HEALTH INSURANCE PROGRAMS*CORNING TOWER, EMPIRE STATE PLAZA~N4*ALBANY*NY*122370080~PER*BL*PROVIDER SERVICES*TE*8003439000*UR*www.emedny.org~N1*PE*MAJOR MEDICAL PROVIDER*XX*9999999995~REF*TJ*000000000~LX*1~CLP*PATIENT ACCOUNT NUMBER*1*34.25*34.25**MC*1000210000000030*11~NM1*QC*1*SUBMITTED LAST*SUBMITTED FIRST****MI*LL99999L~NM1*74*1*CORRECTED LAST*CORRECTED FIRST~REF*EA*PATIENT ACCOUNT NUMBER~DTM*232*20100101~DTM*233*20100101~AMT*AU*34.25~SVC*HC:V2020:RB*6*6**1~DTM*472*20100101~AMT*B6*6~SVC*HC:V2700:RB*2.75*2.75**1~DTM*472*20100101~AMT*B6*2.75~SVC*HC:V2103:RB*5.5*5.5**1~DTM*472*20100101~AMT*B6*5.5~SVC*HC:S0580*20*20**2~DTM*472*20100101~AMT*B6*20~CLP*PATIENT ACCOUNT NUMBER*2*34*0**MC*1000220000000020*11~NM1*QC*1*SUBMITTED LAST*SUBMITTED FIRST****MI*LL88888L~NM1*74*1*CORRECTED LAST*CORRECTED FIRST~REF*EA*PATIENT ACCOUNT NUMBER~DTM*232*20100101~DTM*233*20100101~SVC*HC:V2020*12*0**0~DTM*472*20100101~CAS*CO*29*12~SVC*HC:V2103*22*0**0~DTM*472*20100101~CAS*CO*29*22~CLP*PATIENT ACCOUNT NUMBER*2*34.25*11.5**MC*1000230000000020*11~NM1*QC*1*SUBMITTED LAST*SUBMITTED FIRST****MI*LL77777L~NM1*74*1*CORRECTED LAST*CORRECTED FIRST~REF*EA*PATIENT ACCOUNT NUMBER~DTM*232*20100101~DTM*233*20100101~AMT*AU*11.5~SVC*HC:V2020:RB*6*6**1~DTM*472*20100101~AMT*B6*6~SVC*HC:V2103:RB*5.5*5.5**1~DTM*472*20130917~AMT*B6*5.5~SVC*HC:V2700:RB*2.75*0**0~DTM*472*20100101~CAS*CO*251*2.75~LQ*HE*N206~SVC*HC:S0580*20*0**0~DTM*472*20100101~CAS*CO*251*20~LQ*HE*N206~SE*65*1740~GE*1*6000600~IEA*1*006000600~
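A side note on the sample above: as far as I know, the ISA segment is fixed-width (106 characters including its terminator), so the delimiters can be read from the data instead of being hard-coded. This is only a minimal sketch under that assumption; `detect_delimiters` is a name I made up, and the ISA string is rebuilt from the sample values.

```python
def detect_delimiters(raw: str) -> dict:
    """Read the delimiters from the (assumed fixed-width) ISA segment."""
    if not raw.startswith("ISA") or len(raw) < 106:
        raise ValueError("data does not start with a complete ISA segment")
    return {
        "element_sep": raw[3],      # character right after 'ISA'
        "component_sep": raw[104],  # ISA16, the last element of the segment
        "segment_sep": raw[105],    # character that terminates the ISA segment
    }


# ISA segment rebuilt from the sample file; ljust() pads the fixed-width IDs.
sample_isa = "*".join([
    "ISA", "00", " " * 10, "00", " " * 10,
    "ZZ", "EMEDNYBAT".ljust(15), "ZZ", "ETIN".ljust(15),
    "100101", "1000", "^", "00501", "006000600", "0", "T", ":",
]) + "~"

delims = detect_delimiters(sample_isa)
print(delims)
```

For this sample it gives `*`, `:` and `~`, which matches the values hard-coded in the examples in this post.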

My point is that you will have a deeply nested structure: File -> Interchange(s) -> Functional group(s) -> Transaction set(s) -> Loop(s) (I may be wrong about some of these, but anyway). At each nested level you can either use a built-in container like list, dict, tuple, namedtuple, etc., or write your own class.
What you choose depends on you: what you plan to do, whether you want to validate the data, whether you plan to expand it, and so on.
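To make the "write your own class" option more concrete, here is a rough sketch of the nesting using dataclasses, one class per envelope level (ISA/IEA, GS/GE, ST/SE). The class names and `parse_envelopes` are my own invention, loops inside a transaction set are not modelled, and the shortened sample string is for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class TransactionSet:
    header: list                                   # elements of the ST segment
    segments: list = field(default_factory=list)   # everything between ST and SE


@dataclass
class FunctionalGroup:
    header: list                                   # elements of the GS segment
    transaction_sets: list = field(default_factory=list)


@dataclass
class Interchange:
    header: list                                   # elements of the ISA segment
    groups: list = field(default_factory=list)


def parse_envelopes(raw, segment_sep="~", element_sep="*"):
    """Group segments into Interchange -> FunctionalGroup -> TransactionSet."""
    interchanges = []
    interchange = group = tset = None
    for seg in raw.split(segment_sep):
        seg = seg.strip()
        if not seg:
            continue
        elements = seg.split(element_sep)
        tag = elements[0]
        if tag == "ISA":
            interchange = Interchange(header=elements)
            interchanges.append(interchange)
        elif tag == "GS":
            group = FunctionalGroup(header=elements)
            interchange.groups.append(group)
        elif tag == "ST":
            tset = TransactionSet(header=elements)
            group.transaction_sets.append(tset)
        elif tag == "SE":
            tset = None          # the transaction set is closed
        elif tag in ("GE", "IEA"):
            continue             # trailers of the outer levels, ignored here
        elif tset is not None:
            tset.segments.append(elements)
    return interchanges


# Shortened, made-up sample just to show the shape of the result.
raw = ("ISA*00*X~GS*HP*EMEDNYBAT*ETIN~ST*835*1740~BPR*I*45.75~"
       "CLP*PATIENT ACCOUNT NUMBER*1*34.25*34.25~SE*4*1740~GE*1*1~IEA*1*1~")
result = parse_envelopes(raw)
print(result[0].groups[0].transaction_sets[0].segments)
```

The same idea works with plain dicts and lists; classes mostly pay off once you want validation or methods on each level.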

To start, a very basic example:

import pprint
line_sep = '~'
element_sep = '*'
with open(r'.\835\sample835.txt') as f:
    x12 = f.read()

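# split the file into segments; strip() guards against line breaks after '~'
x12 = [segment.strip() for segment in x12.split(line_sep) if segment.strip()]
message = []  # one dict per interchange (ISA ... IEA)
for segment in x12:
    if segment.startswith('ISA'):
        isa = {} # create empty dict for a new interchange
        isa['ISA'] = segment.split(element_sep)
        isa['payments'] = []
    elif segment.startswith('CLP'):
        # each CLP segment is kept as a list of its elements
        payment = segment.split(element_sep)
        isa['payments'].append(payment)
    elif segment.startswith('IEA'):
        # end of interchange: store it
        message.append(isa)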
pprint.pprint(message)

Output:
[{'ISA': ['ISA',
          '00',
          '          ',
          '00',
          '          ',
          'ZZ',
          'EMEDNYBAT      ',
          'ZZ',
          'ETIN           ',
          '100101',
          '1000',
          '^',
          '00501',
          '006000600',
          '0',
          'T',
          ':'],
  'payments': [['CLP',
                'PATIENT ACCOUNT NUMBER',
                '1',
                '34.25',
                '34.25',
                '',
                'MC',
                '1000210000000030',
                '11'],
               ['CLP',
                'PATIENT ACCOUNT NUMBER',
                '2',
                '34',
                '0',
                '',
                'MC',
                '1000220000000020',
                '11'],
               ['CLP',
                'PATIENT ACCOUNT NUMBER',
                '2',
                '34.25',
                '11.5',
                '',
                'MC',
                '1000230000000020',
                '11']]}]

As you can see, the top level is a list (to allow multiple interchange blocks), each interchange is a dict, and the value for the key "payments" is again a list, holding multiple lists (one per CLP segment), etc.
I work with just the ISA and CLP segments, but I guess you will need to work with other segments/loops too.
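As an idea of how other segments/loops could be handled, here is a hedged sketch that hangs each SVC (service line) under the CLP that precedes it, and attaches the remaining segments (NM1, DTM, AMT, CAS, ...) to whichever claim or service line is currently open. `group_claims` is a made-up name, the rule for closing a claim (LX, PLB, SE, GE, IEA) is a simplification of mine, and the sample is a shortened piece of the file above.

```python
def group_claims(raw, segment_sep="~", element_sep="*"):
    """Nest SVC segments (and their details) under the preceding CLP."""
    claims = []
    claim = None      # claim currently being filled
    service = None    # service line currently being filled
    for seg in raw.split(segment_sep):
        seg = seg.strip()
        if not seg:
            continue
        elements = seg.split(element_sep)
        tag = elements[0]
        if tag == "CLP":
            claim = {"CLP": elements, "segments": [], "services": []}
            claims.append(claim)
            service = None
        elif tag == "SVC" and claim is not None:
            service = {"SVC": elements, "segments": []}
            claim["services"].append(service)
        elif tag in ("LX", "PLB", "SE", "GE", "IEA"):
            claim = service = None            # simplification: these end a claim
        elif service is not None:
            service["segments"].append(elements)
        elif claim is not None:
            claim["segments"].append(elements)
    return claims


raw = ("CLP*PATIENT ACCOUNT NUMBER*1*34.25*34.25**MC*1000210000000030*11~"
       "NM1*QC*1*SUBMITTED LAST*SUBMITTED FIRST****MI*LL99999L~AMT*AU*34.25~"
       "SVC*HC:V2020:RB*6*6**1~DTM*472*20100101~AMT*B6*6~"
       "SVC*HC:S0580*20*20**2~DTM*472*20100101~AMT*B6*20~"
       "CLP*PATIENT ACCOUNT NUMBER*2*34*0**MC*1000220000000020*11~"
       "SVC*HC:V2020*12*0**0~DTM*472*20100101~CAS*CO*29*12~SE*65*1740~")
claims = group_claims(raw)
for c in claims:
    print(c["CLP"][7], [s["SVC"][1] for s in c["services"]])
```

Each claim is then a dict with its own CLP elements, its claim-level segments and a list of service lines, which mirrors the nesting described above.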

From here you can expand, e.g. replace the list for each segment with a namedtuple:

from collections import namedtuple
import pprint

ISA = namedtuple(
    'ISA',
    ['identifier', 'authorization_information_qualifier', 'authorization_information',
     'security_information_qualifier', 'security_information', 'interchange_id_qualifier_isa5',
     'interchange_sender_id', 'interchange_id_qualifier_isa7', 'interchange_receiver_id',
     'interchange_date', 'interchange_time', 'interchange_control_standards',
     'interchange_control_version_number', 'interchange_control_number',
     'acknowledgement_requested', 'usage_indicator', 'component_element_separator'],
    defaults=(None,) * 16 + ('>',))

CLP = namedtuple(
    'CLP',
    ['identifier', 'patient_control_number', 'claim_status_code', 'total_claim_charge_amount',
     'claim_payment_amount', 'claim_filing_indicator_code_', 'payer_claim_control_number',
     'clp07', 'clp08'])


line_sep = '~'
element_sep = '*'
with open(r'.\835\sample835.txt') as f:
    x12 = f.read()

x12 = x12.split(line_sep)

message = []
for segment in x12:
    if segment.startswith('ISA'):
        isa = {} # create empty dict
        isa['ISA'] = ISA(*segment.split(element_sep))
        isa['payments'] = []
    elif segment.startswith('CLP'):
        payment = CLP(*segment.split(element_sep))
        isa['payments'].append(payment)
    elif segment.startswith('IEA'):
        message.append(isa)

pprint.pprint(message)
print('\n')
for isa in message:
    for payment in isa['payments']:
        print(f'Claim payment amount: {payment.claim_payment_amount}')

Output:
[{'ISA': ISA(identifier='ISA', authorization_information_qualifier='00', authorization_information='          ', security_information_qualifier='00', security_information='          ', interchange_id_qualifier_isa5='ZZ', interchange_sender_id='EMEDNYBAT      ', interchange_id_qualifier_isa7='ZZ', interchange_receiver_id='ETIN           ', interchange_date='100101', interchange_time='1000', interchange_control_standards='^', interchange_control_version_number='00501', interchange_control_number='006000600', acknowledgement_requested='0', usage_indicator='T', component_element_separator=':'),
  'payments': [CLP(identifier='CLP', patient_control_number='PATIENT ACCOUNT NUMBER', claim_status_code='1', total_claim_charge_amount='34.25', claim_payment_amount='34.25', claim_filing_indicator_code_='', payer_claim_control_number='MC', clp07='1000210000000030', clp08='11'),
               CLP(identifier='CLP', patient_control_number='PATIENT ACCOUNT NUMBER', claim_status_code='2', total_claim_charge_amount='34', claim_payment_amount='0', claim_filing_indicator_code_='', payer_claim_control_number='MC', clp07='1000220000000020', clp08='11'),
               CLP(identifier='CLP', patient_control_number='PATIENT ACCOUNT NUMBER', claim_status_code='2', total_claim_charge_amount='34.25', claim_payment_amount='11.5', claim_filing_indicator_code_='', payer_claim_control_number='MC', clp07='1000230000000020', clp08='11')]}]


Claim payment amount: 34.25
Claim payment amount: 0
Claim payment amount: 11.5
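A caveat on the CLP field names above: judging by the output, 'MC' ends up in payer_claim_control_number and the control number in clp07, so the names after claim_payment_amount look shifted by one. As far as I understand the 5010 835 layout, CLP05 is the patient responsibility amount, CLP06 the claim filing indicator code, CLP07 the payer claim control number and CLP08 the facility type code; please verify against the implementation guide. Below is a sketch with names adjusted on that assumption. It also converts the amounts to Decimal and tolerates CLP segments with more or fewer elements, which a plain `CLP(*elements)` does not. `Claim`, `parse_clp` and the `extra` field are hypothetical names of mine.

```python
from collections import namedtuple
from decimal import Decimal

# Assumed CLP01..CLP08 names (check them against the implementation guide).
CLP_FIELDS = [
    "identifier", "patient_control_number", "claim_status_code",
    "total_claim_charge_amount", "claim_payment_amount",
    "patient_responsibility_amount", "claim_filing_indicator_code",
    "payer_claim_control_number", "facility_type_code",
]
AMOUNT_FIELDS = (
    "total_claim_charge_amount", "claim_payment_amount",
    "patient_responsibility_amount",
)

# 'extra' collects any elements beyond the named ones.
Claim = namedtuple("Claim", CLP_FIELDS + ["extra"])


def parse_clp(segment, element_sep="*"):
    """Turn one CLP segment into a Claim, padding or truncating as needed."""
    elements = segment.split(element_sep)
    n = len(CLP_FIELDS)
    named = elements[:n] + [""] * max(0, n - len(elements))
    claim = Claim(*named, extra=tuple(elements[n:]))
    # Convert non-empty amounts to Decimal; empty elements stay as ''.
    amounts = {name: Decimal(getattr(claim, name))
               for name in AMOUNT_FIELDS if getattr(claim, name)}
    return claim._replace(**amounts)


claim = parse_clp("CLP*PATIENT ACCOUNT NUMBER*2*34.25*11.5**MC*1000230000000020*11")
print(claim.claim_filing_indicator_code, claim.claim_payment_amount)
```

Decimal instead of float avoids rounding surprises when you later sum payment amounts.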

In addition to the above, which is my own code, I found this: https://hiplab.mc.vanderbilt.edu/git/lab/parse-edi
It's not great in terms of quality of the Python code, ease of installation, etc. I tried to run it but was not very successful with the sample file. Anyway, it may be useful and give you some additional hints if you decide to look at it further.

That's it for now. I apologise if I happened to use incorrect terminology here and there.
