This is the third time I am starting to write this (due to our problems with the site I lost 2 long drafts), so now I am pissed off and this time my post will be as short as possible.
This is X12 EDI format, HIPAA 835 file to be precise. Don't know why you were reluctant to say so from the start or when I asked.
I looked for specifications online, but it's hard to obtain one for free. There are different companion guides available, but they are not exhaustive and, at the same time, company-specific. I found this one most useful:
https://passporthealthplan.com/wp-conten...-guide.pdf
It's still outdated, e.g. the CLP segment they show has only 6 elements, while your CLP segments have more.
I am sure you know all this, but I say it for the benefit of the others.
I also found a sample file here:
https://www.emedny.org/HIPAA/5010/5010_s...index.aspx and downloaded
835 Sample (Professional Claims Only- With Payment)
file and saved it as sample835.txt
Now I will work with it.
Output:
ISA*00* *00* *ZZ*EMEDNYBAT *ZZ*ETIN *100101*1000*^*00501*006000600*0*T*:~GS*HP*EMEDNYBAT*ETIN*20100101*1050*6000600*X*005010X221A1~ST*835*1740~BPR*I*45.75*C*ACH*CCP*01*111*DA*33*1234567890**01*111*DA*22*20100101~TRN*1*10100000000*1000000000~REF*EV*ETIN~DTM*405*20100101~N1*PR*NYSDOH~N3*OFFICE OF HEALTH INSURANCE PROGRAMS*CORNING TOWER, EMPIRE STATE PLAZA~N4*ALBANY*NY*122370080~PER*BL*PROVIDER SERVICES*TE*8003439000*UR*www.emedny.org~N1*PE*MAJOR MEDICAL PROVIDER*XX*9999999995~REF*TJ*000000000~LX*1~CLP*PATIENT ACCOUNT NUMBER*1*34.25*34.25**MC*1000210000000030*11~NM1*QC*1*SUBMITTED LAST*SUBMITTED FIRST****MI*LL99999L~NM1*74*1*CORRECTED LAST*CORRECTED FIRST~REF*EA*PATIENT ACCOUNT NUMBER~DTM*232*20100101~DTM*233*20100101~AMT*AU*34.25~SVC*HC:V2020:RB*6*6**1~DTM*472*20100101~AMT*B6*6~SVC*HC:V2700:RB*2.75*2.75**1~DTM*472*20100101~AMT*B6*2.75~SVC*HC:V2103:RB*5.5*5.5**1~DTM*472*20100101~AMT*B6*5.5~SVC*HC:S0580*20*20**2~DTM*472*20100101~AMT*B6*20~CLP*PATIENT ACCOUNT NUMBER*2*34*0**MC*1000220000000020*11~NM1*QC*1*SUBMITTED LAST*SUBMITTED FIRST****MI*LL88888L~NM1*74*1*CORRECTED LAST*CORRECTED FIRST~REF*EA*PATIENT ACCOUNT NUMBER~DTM*232*20100101~DTM*233*20100101~SVC*HC:V2020*12*0**0~DTM*472*20100101~CAS*CO*29*12~SVC*HC:V2103*22*0**0~DTM*472*20100101~CAS*CO*29*22~CLP*PATIENT ACCOUNT NUMBER*2*34.25*11.5**MC*1000230000000020*11~NM1*QC*1*SUBMITTED LAST*SUBMITTED FIRST****MI*LL77777L~NM1*74*1*CORRECTED LAST*CORRECTED FIRST~REF*EA*PATIENT ACCOUNT NUMBER~DTM*232*20100101~DTM*233*20100101~AMT*AU*11.5~SVC*HC:V2020:RB*6*6**1~DTM*472*20100101~AMT*B6*6~SVC*HC:V2103:RB*5.5*5.5**1~DTM*472*20130917~AMT*B6*5.5~SVC*HC:V2700:RB*2.75*0**0~DTM*472*20100101~CAS*CO*251*2.75~LQ*HE*N206~SVC*HC:S0580*20*0**0~DTM*472*20100101~CAS*CO*251*20~LQ*HE*N206~SE*65*1740~GE*1*6000600~IEA*1*006000600~
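A side note on the separators: as far as I know, the ISA segment is fixed width (106 characters including the terminator), so the delimiters can be read from it instead of being hard-coded. The sample as pasted above appears to have lost its space padding, so this rough sketch builds a padded header itself. The function name detect_delimiters and the widths list are my own illustration, not part of any library.

```python
def detect_delimiters(raw):
    """Read the delimiters from a fixed-width ISA header (assumed 106 chars)."""
    if not raw.startswith('ISA') or len(raw) < 106:
        raise ValueError('not a fixed-width ISA header')
    return {
        'element': raw[3],      # character right after 'ISA'
        'component': raw[104],  # ISA16, the last element
        'segment': raw[105],    # character that terminates the ISA segment
    }


# Assumed field widths for ISA01..ISA16; values taken from the sample above.
widths = [2, 10, 2, 10, 2, 15, 2, 15, 6, 4, 1, 5, 9, 1, 1, 1]
values = ['00', '', '00', '', 'ZZ', 'EMEDNYBAT', 'ZZ', 'ETIN', '100101',
          '1000', '^', '00501', '006000600', '0', 'T', ':']
header = 'ISA*' + '*'.join(v.ljust(w) for v, w in zip(values, widths)) + '~'

print(len(header))
print(detect_delimiters(header))
```

With the real file on disk you would pass the first 106 characters of the file content instead of the hand-built header.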
My point is that you will have a deeply nested structure: File -> Interchange(s) -> Functional group(s) -> Transaction set(s) -> Loop(s) (I may be wrong about some of these, but anyway). At each nested level you can either use a built-in container like list, dict, tuple, namedtuple, etc. or write your own class.
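To make the nesting a bit more concrete, here is a rough sketch using only dicts and lists. The function name parse_envelopes and the keys 'groups', 'transactions' and 'segments' are made up for illustration, the sample string is an abbreviated version of the file above, and the code assumes a well-formed file (every segment sits inside an ISA/GS/ST envelope).

```python
def parse_envelopes(raw, segment_sep='~', element_sep='*'):
    """Group segments into interchange -> functional group -> transaction set."""
    interchanges = []
    interchange = group = transaction = None
    for raw_segment in raw.split(segment_sep):
        raw_segment = raw_segment.strip('\r\n')
        if not raw_segment:
            continue  # empty string left after the last terminator
        elements = raw_segment.split(element_sep)
        tag = elements[0]
        if tag == 'ISA':
            interchange = {'ISA': elements, 'groups': []}
            interchanges.append(interchange)
        elif tag == 'GS':
            group = {'GS': elements, 'transactions': []}
            interchange['groups'].append(group)
        elif tag == 'ST':
            transaction = {'ST': elements, 'segments': []}
            group['transactions'].append(transaction)
        elif tag in ('SE', 'GE', 'IEA'):
            continue  # trailers only close a level; not stored in this sketch
        else:
            transaction['segments'].append(elements)
    return interchanges


# Abbreviated, unpadded version of the sample file (illustration only).
sample = ('ISA*00**00**ZZ*EMEDNYBAT*ZZ*ETIN*100101*1000*^*00501*006000600*0*T*:~'
          'GS*HP*EMEDNYBAT*ETIN*20100101*1050*6000600*X*005010X221A1~'
          'ST*835*1740~LX*1~'
          'CLP*PATIENT ACCOUNT NUMBER*1*34.25*34.25**MC*1000210000000030*11~'
          'SE*4*1740~GE*1*6000600~IEA*1*006000600~')

for interchange in parse_envelopes(sample):
    for group in interchange['groups']:
        for transaction in group['transactions']:
            print(transaction['ST'], [s[0] for s in transaction['segments']])
```

Loops inside the transaction set (claims, service lines) are left flat here; they could be nested the same way.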
What you choose depends on you: what you plan to do, whether you want to validate the data, whether you plan to expand it, and so on.
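If you do want validation, one simple check is the transaction set trailer. To my understanding SE01 holds the number of segments in the set (ST and SE included) and SE02 repeats the control number from ST02. A minimal sketch, with check_transaction_sets being a name I made up:

```python
def check_transaction_sets(segments):
    """Return a list of problems found in ST/SE pairs.

    `segments` is a list of element lists, e.g. [['ST', '835', '1740'], ...].
    """
    problems = []
    start = None  # index of the currently open ST segment
    for index, elements in enumerate(segments):
        if elements[0] == 'ST':
            start = index
        elif elements[0] == 'SE':
            if start is None:
                problems.append('SE without a preceding ST')
                continue
            actual = index - start + 1  # ST and SE are both counted
            declared = int(elements[1])
            if declared != actual:
                problems.append(f'SE01 says {declared} segments, found {actual}')
            if elements[2] != segments[start][2]:
                problems.append('SE02 does not match ST02')
            start = None
    return problems


good = [s.split('*') for s in 'ST*835*1740~LX*1~CLP*X*1*34.25*34.25~SE*4*1740'.split('~')]
bad = [s.split('*') for s in 'ST*835*1740~LX*1~SE*4*9999'.split('~')]
print(check_transaction_sets(good))
print(check_transaction_sets(bad))
```

Similar checks could be written for GE and IEA, which carry counts and control numbers for their own levels.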
For a start, a very basic example:
import pprint

line_sep = '~'
element_sep = '*'

with open(r'.\835\sample835.txt') as f:
    x12 = f.read()

x12 = x12.split(line_sep)
message = []
for segment in x12:
    if segment.startswith('ISA'):
        isa = {} # create empty dict
        isa['ISA'] = segment.split(element_sep)
        isa['payments'] = []
    elif segment.startswith('CLP'):
        payment = segment.split(element_sep)
        isa['payments'].append(payment)
    elif segment.startswith('IEA'):
        message.append(isa)

pprint.pprint(message)
Output:
[{'ISA': ['ISA',
          '00',
          ' ',
          '00',
          ' ',
          'ZZ',
          'EMEDNYBAT ',
          'ZZ',
          'ETIN ',
          '100101',
          '1000',
          '^',
          '00501',
          '006000600',
          '0',
          'T',
          ':'],
  'payments': [['CLP',
                'PATIENT ACCOUNT NUMBER',
                '1',
                '34.25',
                '34.25',
                '',
                'MC',
                '1000210000000030',
                '11'],
               ['CLP',
                'PATIENT ACCOUNT NUMBER',
                '2',
                '34',
                '0',
                '',
                'MC',
                '1000220000000020',
                '11'],
               ['CLP',
                'PATIENT ACCOUNT NUMBER',
                '2',
                '34.25',
                '11.5',
                '',
                'MC',
                '1000230000000020',
                '11']]}]
As you can see, the result is a list (to allow multiple interchange blocks), each interchange is a dict, and the value for the key "payments" is again a list, holding multiple lists, etc.
I work with just ISA and CLP segments, but I guess you will need to work on other segments/loops too.
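For example, the SVC (service line) segments that follow each CLP could be attached to their claim. A minimal sketch, assuming every SVC belongs to the most recent CLP; the function name group_service_lines and the 'services' key are my own, and the sample string is cut down from the file above.

```python
def group_service_lines(raw, segment_sep='~', element_sep='*'):
    """Collect each CLP segment together with the SVC segments that follow it."""
    claims = []
    current = None  # claim that is currently being filled
    for segment in raw.split(segment_sep):
        elements = segment.split(element_sep)
        if elements[0] == 'CLP':
            current = {'CLP': elements, 'services': []}
            claims.append(current)
        elif elements[0] == 'SVC' and current is not None:
            current['services'].append(elements)
        elif elements[0] == 'SE':
            current = None  # end of the transaction set closes the claim
    return claims


sample = ('CLP*PATIENT ACCOUNT NUMBER*1*34.25*34.25**MC*1000210000000030*11~'
          'SVC*HC:V2020:RB*6*6**1~SVC*HC:V2700:RB*2.75*2.75**1~'
          'CLP*PATIENT ACCOUNT NUMBER*2*34*0**MC*1000220000000020*11~'
          'SVC*HC:V2020*12*0**0~')

for claim in group_service_lines(sample):
    # SVC01 is a composite element; ':' is the component separator in this file
    codes = [svc[1].split(':')[1] for svc in claim['services']]
    print(claim['CLP'][7], codes)
```

The same pattern (remember the current parent, append children to it) extends to the other segments inside a claim, such as NM1, DTM, or CAS.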
From here you can expand, e.g. replace the list for each segment with a namedtuple:
from collections import namedtuple
import pprint

ISA = namedtuple('ISA', ['identifier', 'authorization_information_qualifier', 'authorization_information',
                         'security_information_qualifier', 'security_information', 'interchange_id_qualifier_isa5',
                         'interchange_sender_id', 'interchange_id_qualifier_isa7', 'interchange_receiver_id',
                         'interchange_date', 'interchange_time', 'interchange_control_standards',
                         'interchange_control_version_number', 'interchange_control_number',
                         'acknowledgement_requested', 'usage_indicator', 'component_element_separator'],
                 defaults=(None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, '>'))

# CLP05 (patient responsibility amount) is empty in the sample file,
# so the claim filing indicator code 'MC' is CLP06.
CLP = namedtuple('CLP', ['identifier', 'patient_control_number', 'claim_status_code', 'total_claim_charge_amount',
                         'claim_payment_amount', 'patient_responsibility_amount', 'claim_filing_indicator_code',
                         'payer_claim_control_number', 'facility_type_code'])

line_sep = '~'
element_sep = '*'

with open(r'.\835\sample835.txt') as f:
    x12 = f.read()

x12 = x12.split(line_sep)
message = []
for segment in x12:
    if segment.startswith('ISA'):
        isa = {} # create empty dict
        isa['ISA'] = ISA(*segment.split(element_sep))
        isa['payments'] = []
    elif segment.startswith('CLP'):
        payment = CLP(*segment.split(element_sep))
        isa['payments'].append(payment)
    elif segment.startswith('IEA'):
        message.append(isa)

pprint.pprint(message)
print('\n')
for isa in message:
    for payment in isa['payments']:
        print(f'Claim payment amount: {payment.claim_payment_amount}')
Output:
[{'ISA': ISA(identifier='ISA', authorization_information_qualifier='00', authorization_information=' ', security_information_qualifier='00', security_information=' ', interchange_id_qualifier_isa5='ZZ', interchange_sender_id='EMEDNYBAT ', interchange_id_qualifier_isa7='ZZ', interchange_receiver_id='ETIN ', interchange_date='100101', interchange_time='1000', interchange_control_standards='^', interchange_control_version_number='00501', interchange_control_number='006000600', acknowledgement_requested='0', usage_indicator='T', component_element_separator=':'),
  'payments': [CLP(identifier='CLP', patient_control_number='PATIENT ACCOUNT NUMBER', claim_status_code='1', total_claim_charge_amount='34.25', claim_payment_amount='34.25', patient_responsibility_amount='', claim_filing_indicator_code='MC', payer_claim_control_number='1000210000000030', facility_type_code='11'),
               CLP(identifier='CLP', patient_control_number='PATIENT ACCOUNT NUMBER', claim_status_code='2', total_claim_charge_amount='34', claim_payment_amount='0', patient_responsibility_amount='', claim_filing_indicator_code='MC', payer_claim_control_number='1000220000000020', facility_type_code='11'),
               CLP(identifier='CLP', patient_control_number='PATIENT ACCOUNT NUMBER', claim_status_code='2', total_claim_charge_amount='34.25', claim_payment_amount='11.5', patient_responsibility_amount='', claim_filing_indicator_code='MC', payer_claim_control_number='1000230000000020', facility_type_code='11')]}]
Claim payment amount: 34.25
Claim payment amount: 0
Claim payment amount: 11.5
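One caveat with the namedtuple approach: unpacking with * only works while a segment has no more elements than the namedtuple has fields, and a CLP with extra elements (like yours) would raise TypeError. A possible workaround, sketched with a hypothetical helper build_segment and a throwaway Demo type, is to pad short segments with None and hand back any surplus elements separately:

```python
from collections import namedtuple


def build_segment(segment_type, elements):
    """Fit a list of elements to a namedtuple, whatever the list's length.

    Returns the namedtuple plus a list of elements that did not fit.
    """
    size = len(segment_type._fields)
    extra = elements[size:]  # elements beyond the known fields
    padded = elements[:size] + [None] * (size - len(elements))
    return segment_type(*padded), extra


# Demo type for illustration only; not a real X12 segment.
Demo = namedtuple('Demo', ['identifier', 'first', 'second'])

short, _ = build_segment(Demo, 'DEM*a'.split('*'))
longer, extra = build_segment(Demo, 'DEM*a*b*c*d'.split('*'))
print(short)
print(longer, extra)
```

In the loop above this would replace CLP(*segment.split(element_sep)) with build_segment(CLP, segment.split(element_sep)), keeping or ignoring the second return value as you see fit.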
In addition to the above, which is my own code, I found this:
https://hiplab.mc.vanderbilt.edu/git/lab/parse-edi
It's not great in terms of the quality of the Python code, ease of installation, etc. I tried to run it but was not very successful with the sample file. Anyway, it may be useful and give you some additional hints if you decide to look at it further.
That's it for now. I apologise if I happened to use incorrect terminology here and there.