I have a comma-delimited CSV file that I load using csv.DictReader.
This is wordier than I would like, though I tried to keep it concise. I need the two counts and help with an inner loop over person_id.
csv...
person_id,type,arrive_date,leave_date
90,check_in,2/15/2018,2/15/2018
90,brunch,2/15/2018,2/15/2018
90,lunch ,2/15/2018,2/15/2018
90,cancelled,2/16/2018,2/16/2018
90,breakfast ,2/15/2018,2/22/2018
80,,,
40,check_in,2/15/2018,2/15/2018
50,check_in,2/15/2018,3/1/2018
50,breakfast ,2/15/2018,2/26/2018
50,lunch ,2/15/2018,3/1/2018
60,check_in,2/15/2018,2/15/2018
60,dinner,2/15/2018,2/15/2018
60,lunch ,2/21/2018,2/21/2018
60,breakfast ,3/15/2018,3/15/2018
35,check_in,3/15/2018,3/15/2018
35,cancelled,3/20/2018,3/20/2018
35,cancelled,3/21/2018,3/21/2018
Code to read...
```python
import csv

with open(r'C:\Users\delliott\Desktop\pythoncsv\Q3\eat.csv', 'rt') as f:
    reader = csv.DictReader(f, delimiter=',')
    for row in reader:
        print(row)  # each row maps the header names to that line's values
```
Looks like...
OrderedDict([('person_id', '90'), ('type', 'check_in'), ('arrive_date', '2/15/2018'), ('leave_date', '2/15/2018')])
OrderedDict([('person_id', '90'), ('type', 'brunch'), ('arrive_date', '2/15/2018'), ('leave_date', '2/15/2018')])
etc...
Header: ['person_id', 'type', 'arrive_date', 'leave_date']
I need 2 counts…
eat_count
no_eat_count
If a person_id has at least one visit type other than check_in and other than cancelled, the eat_count is incremented once for that person_id. A person_id with no entries in the other columns does not count here; for example, person_id 80 has no values in the other columns.
If a person_id has only visits equal to check_in and/or cancelled, or has no visit entries at all, the no_eat_count is incremented once for that person_id. person_id 80 counts here.
The output should be…
eat_count = 3
no_eat_count = 3
This is the breakdown for each person_id for clarification. It’s not needed in the output.
90 - eat_count (has types other than check_in and cancelled)
80 - no_eat_count (blank entries)
40 - no_eat_count (check_in only)
50 - eat_count
60 - eat_count
35 - no_eat_count (check_in and cancelled only)
I need something like the below code. I'd like an inner loop through each person_id within the for row in reader loop.
```python
import csv

# Counters must exist before the loop can increment them.
eat_count = 0
no_eat_count = 0

with open(r'C:\eat.csv', 'rt') as f:
    reader = csv.DictReader(f, delimiter=',')
    for row in reader:
        # I need something like the below
        # this is not correct
        for k in row['person_id']:
            if 'check_in' not in row['type']:
                if 'cancelled' not in row['type']:
                    eat_count += 1
                    break
            else:
                no_eat_count += 1

print('eat count = ' + str(eat_count))
print('no eat count = ' + str(no_eat_count))
```
Thanks.
Additional comment... the two date columns are not used in this code. I probably should have excluded them in this question but will use them later.
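One possible approach, offered as a minimal sketch rather than a definitive answer: collect the set of visit types seen for each person_id in a dictionary while reading, then classify each person once after the whole file has been read. That sidesteps the inner loop, since for k in row['person_id'] iterates over the characters of the id string, not over that person's rows. The names count_eaters, NON_MEAL_TYPES and SAMPLE are made up for illustration, and io.StringIO stands in for the real file so the example runs on its own. It assumes a blank type should still register the person, which is what puts person_id 80 in no_eat_count.

```python
import csv
import io
from collections import defaultdict

# Sample data copied from the question; io.StringIO stands in for the real file.
SAMPLE = """person_id,type,arrive_date,leave_date
90,check_in,2/15/2018,2/15/2018
90,brunch,2/15/2018,2/15/2018
90,lunch ,2/15/2018,2/15/2018
90,cancelled,2/16/2018,2/16/2018
90,breakfast ,2/15/2018,2/22/2018
80,,,
40,check_in,2/15/2018,2/15/2018
50,check_in,2/15/2018,3/1/2018
50,breakfast ,2/15/2018,2/26/2018
50,lunch ,2/15/2018,3/1/2018
60,check_in,2/15/2018,2/15/2018
60,dinner,2/15/2018,2/15/2018
60,lunch ,2/21/2018,2/21/2018
60,breakfast ,3/15/2018,3/15/2018
35,check_in,3/15/2018,3/15/2018
35,cancelled,3/20/2018,3/20/2018
35,cancelled,3/21/2018,3/21/2018
"""

# Hypothetical name: visit types that do not count as eating.
NON_MEAL_TYPES = {"check_in", "cancelled"}


def count_eaters(csv_file):
    """Return (eat_count, no_eat_count) for an open CSV file object."""
    types_by_person = defaultdict(set)
    for row in csv.DictReader(csv_file, delimiter=","):
        # strip() tidies values such as 'lunch ' that carry a trailing space.
        visit_type = (row["type"] or "").strip()
        # Looking up the key registers the person even when the type is blank.
        person_types = types_by_person[row["person_id"]]
        if visit_type:
            person_types.add(visit_type)

    # A person eats if any of their types falls outside NON_MEAL_TYPES.
    eat_count = sum(
        1 for types in types_by_person.values() if types - NON_MEAL_TYPES
    )
    # Everyone else has only check_in/cancelled rows, or no types at all.
    no_eat_count = len(types_by_person) - eat_count
    return eat_count, no_eat_count


eat_count, no_eat_count = count_eaters(io.StringIO(SAMPLE))
print("eat count = " + str(eat_count))
print("no eat count = " + str(no_eat_count))
```

With the real file, the open file object can be passed in place of the StringIO, for example count_eaters(f) inside the with open(...) block. The date columns are read but ignored here, so they remain available for later use.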