Here is how I would do it (probably :-), if I don't go OOP), using namedtuple from the collections module to make the code more readable:
from collections import namedtuple, OrderedDict
import csv

# define namedtuple Patient. It will represent individual patient data (i.e. single row from csv)
Patient = namedtuple('Patient', ['id', 'hist_glucose', 'scan_glucose'])

# read the csv file
with open('summary.txt') as f:
    reader = csv.DictReader(f, delimiter=',')
    # this will be OrderedDict to hold all data
    patients = OrderedDict()
    # iterate over file
    for row in reader:
        patient_id = int(row['ID'])  # convert id to int
        # try to convert Historic Glucose
        try:
            hist_glucose = int(row['Historic Glucose (mg/dL)'])
        except ValueError:
            hist_glucose = 0  # default value of 0
        # try to convert Scan Glucose
        try:
            scan_glucose = int(row['Scan Glucose (mg/dL)'])
        except ValueError:
            scan_glucose = 0
        # add current patient to patients
        patients[patient_id] = Patient(patient_id, hist_glucose, scan_glucose)

# iterate over patients and print each patient
for patient in patients.values():
    print(patient)

# calculate number of patients
num_patients = len(patients)
print('Number of patients: {}'.format(num_patients))

# calculate average historic glucose
ave_historic = sum(patient.hist_glucose for patient in patients.values())/num_patients
print('Historic Glucose average : {:.3f}'.format(ave_historic))

# calculate average scan glucose
ave_scan = sum(patient.scan_glucose for patient in patients.values())/num_patients
print('Scan Glucose average : {:.3f}'.format(ave_scan))

# access individual patient data
print('Patient {id}, Historic Glucose: {hist_glucose} (mg/dL)'.format(**patients[132]._asdict()))
This assumes the following sample input file, summary.txt (note there is a bad record, id=135, which has no Historic Glucose value):
ID,Time,Record Type,Historic Glucose (mg/dL),Scan Glucose (mg/dL),Non-numeric Rapid-Acting Insulin,Rapid-Acting Insulin (units),Non-numeric Food,Carbohydrates (grams),Non-numeric Long-Acting Insulin,Long-Acting Insulin (units),Notes,Strip Glucose (mg/dL),Ketone (mmol/L),N/A,Previous Time,Updated Time
132,2018/07/28 01:41,0,141,,,,,,,,,,,,,,,
133,2018/07/28 01:56,0,133,,,,,,,,,,,,,,,
134,2018/07/28 02:11,0,126,,,,,,,,,,,,,,,
135,2018/07/28 02:27,1,,123,,,,,,,,,,,,,,
137,2018/07/28 02:27,0,126,,,,,,,,,,,,,,,
138,2018/07/28 02:42,0,119,,,,,,,,,,,,,,,
139,2018/07/28 02:57,0,96,,,,,,,,,,,,,,,
The output from running the script is:
Output:
Patient(id=132, hist_glucose=141, scan_glucose=0)
Patient(id=133, hist_glucose=133, scan_glucose=0)
Patient(id=134, hist_glucose=126, scan_glucose=0)
Patient(id=135, hist_glucose=0, scan_glucose=123)
Patient(id=137, hist_glucose=126, scan_glucose=0)
Patient(id=138, hist_glucose=119, scan_glucose=0)
Patient(id=139, hist_glucose=96, scan_glucose=0)
Number of patients: 7
Historic Glucose average : 105.857
Scan Glucose average : 17.571
Patient 132, Historic Glucose: 141 (mg/dL)
Of course, more improvements are possible (there is a lot of room for improvement or for doing things differently), but for a start this works.
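One possible improvement, as a rough sketch only: defaulting missing readings to 0 pulls the averages down (the Scan Glucose average in the output is 123/7, although only one record has a scan value). Assuming you would rather skip missing values than count them as 0, something like the following could work. The helpers `to_int_or_none`, `average` and `read_patients` are names I made up for illustration, and the inline sample is a trimmed-down stand-in for the real file, keeping only the three columns the script uses.

```python
import csv
import io
from collections import namedtuple

Patient = namedtuple('Patient', ['id', 'hist_glucose', 'scan_glucose'])

# Trimmed-down sample data (assumption: same column names as the real file)
SAMPLE = """ID,Historic Glucose (mg/dL),Scan Glucose (mg/dL)
132,141,
133,133,
135,,123
"""


def to_int_or_none(value):
    """Hypothetical helper: int(value), or None if it is empty or not a number."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def average(values):
    """Hypothetical helper: mean of the values that are not None, or None if there are none."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


def read_patients(f):
    """Hypothetical helper: build a dict of Patient tuples keyed by id."""
    patients = {}  # a plain dict keeps insertion order on Python 3.7+
    for row in csv.DictReader(f):
        patient_id = int(row['ID'])
        patients[patient_id] = Patient(
            patient_id,
            to_int_or_none(row['Historic Glucose (mg/dL)']),
            to_int_or_none(row['Scan Glucose (mg/dL)']),
        )
    return patients


patients = read_patients(io.StringIO(SAMPLE))
ave_historic = average(p.hist_glucose for p in patients.values())
ave_scan = average(p.scan_glucose for p in patients.values())
print('Historic Glucose average : {:.3f}'.format(ave_historic))  # 137.000
print('Scan Glucose average : {:.3f}'.format(ave_scan))          # 123.000
```

To run it against the real file, pass an open file object instead of the `io.StringIO` sample, e.g. `read_patients(open('summary.txt', newline=''))` inside a `with` block. Whether a missing reading should be skipped or treated as 0 depends on what the averages are meant to say, so take this as one option rather than the right answer.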