![]() |
Help iterating through DictReader loaded from csv - Printable Version +- Python Forum (https://python-forum.io) +-- Forum: Python Coding (https://python-forum.io/forum-7.html) +--- Forum: Data Science (https://python-forum.io/forum-44.html) +--- Thread: Help iterating through DictReader loaded from csv (/thread-9859.html) |
Help iterating through DictReader loaded from csv - Huck - May-01-2018 I have a comma delimited csv file loaded using DictReader. csv... person_id,type,arrive_date,leave_date 90,check_in,2/15/2018,2/15/2018 90,brunch,2/15/2018,2/15/2018 90,lunch ,2/15/2018,2/15/2018 90,cancelled,2/16/2018,2/16/2018 90,breakfast ,2/15/2018,2/22/2018 80,,, 40,check_in,2/15/2018,2/15/2018 50,check_in,2/15/2018,3/1/2018 50,breakfast ,2/15/2018,2/26/2018 50,lunch ,2/15/2018,3/1/2018 60,check_in,2/15/2018,2/15/2018 60,dinner,2/15/2018,2/15/2018 60,lunch ,2/21/2018,2/21/2018 60,breakfast ,3/15/2018,3/15/2018 35,check_in,3/15/2018,3/15/2018 35,cancelled,3/20/2018,3/20/2018 35,cancelled,3/21/2018,3/21/2018 Code to read... import csv with open(r'C:\Users\delliott\Desktop\pythoncsv\Q3\eat.csv', 'rt') as f: reader = csv.DictReader(f, delimiter=',') for row in reader:Looks like... OrderedDict([('person_id', '90'), ('type', 'check_in'), ('arrive_date', '2/15/2018'), ('leave_date', '2/15/2018')]) OrderedDict([('person_id', '90'), ('type', 'brunch'), ('arrive_date', '2/15/2018'), ('leave_date', '2/15/2018')]) etc... Header: ['person_id', 'type', 'arrive_date', 'leave_date'] I need 2 counts… eat_count no_eat_count If a person_id has a visit entry other than check_in and other than cancelled, the eat_count is incremented. A person_id with no entries in the other columns does not count. person_id 80 has no values in the other columns and does not count here. If a person_id has only visits equal check_in and/or cancelled, the no_eat_count is incrememented. person_id 80 counts here. The output should be… eat_count = 3 no_eat_count = 3 This is the breakdown for each person_id for clarification. It’s not needed in the output. 90 - eat_count (has types other than check_in and cancelled) 80 - no_eat_count (blank entries) 40 - no_eat_count (check_in only) 50 - eat_count 60 - eat_count 35 - no_eat_count (check_in and cancelled only) I need something like the below code. I'd like an inner loop through each person_id within the for row in reader loop. import csv with open(r'C:\eat.csv', 'rt') as f: reader = csv.DictReader(f, delimiter=',') for row in reader: #I need something like the below #this is not correct for k in row['person_id']: if 'check_in' not in row ['type']: if 'cancelled' not in row ['type']: eat_count += 1 break else: no_eat_count += 1 print('eat count = ' + str(eat_count)) print('no eat count = ' + str(no_eat_count))This is wordier than I would like. I tried to make it concise. I need the two counts and help with that inner loop on person_id. Thanks. Additional comment... the two date columns are not used in this code. I probably should have excluded them in this question but will use them later. RE: Help iterating through DictReader loaded from csv - Larz60+ - May-01-2018 This is much easier to deal with as a plain list. Here's the two methods compared: import csv def read_file1(filename): with open(filename, 'r') as f: # with open(r'C:\Users\delliott\Desktop\pythoncsv\Q3\eat.csv', 'rt') as f: reader = csv.DictReader(f, delimiter=',') for row in reader: print(row) def read_file2(filename): with open(filename, 'r') as f: reader = csv.reader(f) for row in reader: print(row) if __name__ == '__main__': print('\n\nUsing read_file1') read_file1('eat.csv') print('\n\nUsing read_file2') read_file2('eat.csv')results:
RE: Help iterating through DictReader loaded from csv - Huck - May-01-2018 Thanks, Larz60+. I'll try the list. I'm new to Python and am still figuring out how to specify the columns in the code. I don't know if I use header name or an index at this point. import csv eat_count = 0 no_eat_count = 0 with open(r'C:\Users\delliott\Desktop\pythoncsv\Q3\eat.csv', 'r') as f: reader = csv.reader(f) for row in reader: #the inner loop is still incorrect #I'm not sure if I can use the column names instead of indexes with lists for k in row['person_id']: if 'check_in' not in row ['type']: if 'cancelled' not in row ['type']: eat_count += 1 break else: no_eat_count += 1 print('eat count = ' + str(eat_count)) print('no eat count = ' + str(no_eat_count))I'll appreciate any suggestions using either the list or OrderedDict. I'll later try using Pandas. OK, I see how to specify the columns using indexes... row [1]. Now I have... import csv eat_count = 0 no_eat_count = 0 with open(r'C:\Users\delliott\Desktop\pythoncsv\Q3\eat.csv', 'r') as f: reader = csv.reader(f) next(reader, None) # skip the headers for row in reader: #the inner loop is still incorrect #I don't know how to loop through each person_id for k in row[0]: if 'check_in' not in row [1]: if 'cancelled' not in row [1]: if row [1] != "": eat_count += 1 print(row) break else: no_eat_count += 1 print('eat count = ' + str(eat_count)) print('no eat count = ' + str(no_eat_count))The incorrect output... eat count = 8 no eat count = 9 It should be... eat count = 3 no eat count = 3 The rows counted in eat count... ['90', 'brunch', '2/15/2018', '2/15/2018'] ['90', 'lunch ', '2/15/2018', '2/15/2018'] ['90', 'breakfast ', '2/15/2018', '2/22/2018'] ['50', 'breakfast ', '2/15/2018', '2/26/2018'] ['50', 'lunch ', '2/15/2018', '3/1/2018'] ['60', 'dinner', '2/15/2018', '2/15/2018'] ['60', 'lunch ', '2/21/2018', '2/21/2018'] ['60', 'breakfast ', '3/15/2018', '3/15/2018'] Each person_id should only be counted once. I'm not sure how to do that inner loop on person_id in Python. I was attempting to loop through and then break out of the inner loop once the eat_count is incremented. Can I get any suggestions? RE: Help iterating through DictReader loaded from csv - Larz60+ - May-01-2018 To separate into columns, you could do something like: import csv def read_file(filename): column_titles = None with open(filename, 'r') as f: reader = csv.reader(f) for row in reader: for col in row: print('{0:<15} '.format(col), end='') print() if __name__ == '__main__': read_file('eat.csv')which will yield:
RE: Help iterating through DictReader loaded from csv - Huck - May-04-2018 Thanks for the replies. They will be helpful in the long run. I'm still not sure how to handle the loops to get the totals I need... eat_count - A person_id has a visit entry other than check_in and other than cancelled, the eat_count is incremented. A person_id with no entries in the other columns does not count. person_id 80 has no values in the other columns and does not count here. no_eat_count a person_id has only visits equal check_in and/or cancelled, the no_eat_count is incremented. person_id 80 counts here. Each person_id should only have one count. The output should be… eat_count = 3 no_eat_count = 3 I know it should be simple, but I don't yet grasp how to work with lists in Python. Can anyone help? Thanks. |