Posts: 13
Threads: 3
Joined: Apr 2018
Comma delimited file:
ID,Colors
1,red
2,red
3,blue
4,red
5,blue
6,blue
7,red
8,blue
9,blue
10,blue
My code reads the above comma delimited file and tallies the counts for each color:
import csv
from datetime import datetime

blue_count = 0
red_count = 0
blues = []
reds = []

with open(r'C:\Users\delliott\Desktop\SDA Update\colors.csv', 'r') as f:
    reader = csv.reader(f)
    next(reader, None)  # skip the headers
    for row in reader:
        for k in row[0]:
            if 'blue' in row[1]:
                blue_count += 1
                blues.append(row[0])
                break

with open(r'C:\Users\delliott\Desktop\SDA Update\colors.csv', 'r') as f:
    reader = csv.reader(f)
    next(reader, None)  # skip the headers
    for row in reader:
        for k in row[0]:
            if 'red' in row[1]:
                red_count += 1
                reds.append(row[0])
                break

print('blue count ' + str(blue_count))
print('red count ' + str(red_count))
This is an oversimplification of a much longer program that will need to use the same data for multiple loops. I could easily use an if/else for the above in one loop. I am new to Python and want to learn the best way to reset (restart) the csv.reader. The above is just a mockup.
How can I reset the csv.reader for the second loop instead of loading the csv file again?
Thanks.
Posts: 3,458
Threads: 101
Joined: Sep 2016
You're reading a file, so using seek() can let you choose where in the file you are: https://docs.python.org/3/library/io.htm...OBase.seek
import csv

blue = 0
with open("my-file.csv") as f:
    reader = csv.reader(f)
    # skip headers
    next(reader)
    for row in reader:
        if "blue" in row:
            blue += 1
            if blue > 5:
                # jump back to the start of the file
                f.seek(0)
                # skip headers again
                next(reader)
            if blue > 50:
                break
        print(row)
print(f"blue={blue}")
Output:
['1', 'red']
['2', 'red']
['3', 'blue']
['4', 'red']
['5', 'blue']
['6', 'blue']
['7', 'red']
['8', 'blue']
['9', 'blue']
['10', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
['3', 'blue']
['1', 'red']
['2', 'red']
blue=51
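The thresholds above (> 5 and > 50) are only there to show the rewind happening. A minimal sketch of two complete passes, assuming the sample data from the first post: the for loop simply ends when the file is exhausted, so there is no need to detect the last row. The helper name ids_with_color and the in-memory SAMPLE file are made up for illustration, and the sketch compares the color field with == rather than in.

```python
import csv
import io

# Stand-in for the real file: the sample data from the first post, held in memory.
SAMPLE = (
    "ID,Colors\n"
    "1,red\n2,red\n3,blue\n4,red\n5,blue\n"
    "6,blue\n7,red\n8,blue\n9,blue\n10,blue\n"
)


def ids_with_color(f, color):
    """Rewind f, then return the IDs of all rows whose color field equals color."""
    f.seek(0)               # go back to the start of the file
    reader = csv.reader(f)  # build a fresh reader for this pass
    next(reader, None)      # skip the header row
    return [row[0] for row in reader if row[1] == color]


with io.StringIO(SAMPLE) as f:
    blues = ids_with_color(f, "blue")  # first full pass
    reds = ids_with_color(f, "red")    # second full pass over the same open file

print("blue count", len(blues))
print("red count", len(reds))
```

With a real file, open(...) would replace io.StringIO(SAMPLE). Building a new csv.reader after each seek(0) is a choice made here so that no state is carried over from the previous pass.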
Posts: 13
Threads: 3
Joined: Apr 2018
Output:
blue count 6
red count 4
Posts: 8,090
Threads: 154
Joined: Sep 2016
Jul-30-2018, 04:20 PM
(This post was last modified: Jul-30-2018, 04:20 PM by buran.)
There is no need to loop multiple times, either by opening the file multiple times or by using seek.
import csv

colors = {'red':{'count':0, 'rows':[]}, 'blue':{'count':0, 'rows':[]}}
with open('colors.csv') as f:
    rdr = csv.DictReader(f)
    for row in rdr:
        colors[row['Colors']]['count'] += 1
        colors[row['Colors']]['rows'].append(row)
print(colors)
Output:
{'red': {'count': 4, 'rows': [{'Colors': 'red', 'ID': '1'}, {'Colors': 'red', 'ID': '2'}, {'Colors': 'red', 'ID': '4'}, {'Colors': 'red', 'ID': '7'}]}, 'blue': {'count': 6, 'rows': [{'Colors': 'blue', 'ID': '3'}, {'Colors': 'blue', 'ID': '5'}, {'Colors': 'blue', 'ID': '6'}, {'Colors': 'blue', 'ID': '8'}, {'Colors': 'blue', 'ID': '9'}, {'Colors': 'blue', 'ID': '10'}]}}
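If the set of colors is not known ahead of time, the same single-pass idea can be sketched with collections.defaultdict, so that no key has to be created in advance. This is only an illustration on the sample data; the names SAMPLE and ids_by_color are made up here.

```python
import csv
import io
from collections import defaultdict

# Stand-in for colors.csv: the sample data from the first post, held in memory.
SAMPLE = (
    "ID,Colors\n"
    "1,red\n2,red\n3,blue\n4,red\n5,blue\n"
    "6,blue\n7,red\n8,blue\n9,blue\n10,blue\n"
)

# Maps each color to the list of IDs seen with it; missing keys start as [].
ids_by_color = defaultdict(list)

with io.StringIO(SAMPLE) as f:
    for row in csv.DictReader(f):
        ids_by_color[row["Colors"]].append(row["ID"])

for color, ids in ids_by_color.items():
    print(color, len(ids), ids)
```

The count for a color is just the length of its list, so a separate counter is not needed.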
Posts: 13
Threads: 3
Joined: Apr 2018
Jul-30-2018, 04:42 PM
(This post was last modified: Jul-30-2018, 04:42 PM by Huck.)
Thanks, nilamo.
I might have oversimplified too much. I won't know the number of rows in the files I will be using. I will need to iterate through all of the rows to get all of the blues and then iterate again to get all of the reds. If there is a way to check for the last row (instead of > 5 or > 50) and then use seek(0), I might be able to get it to work. If not, I might have to try another approach. It's also possible that I am not interpreting your answer correctly.
Is there a way to check for the last row before closing the file?
Thanks, buran. That will work with the example I provided. But I will need to reset with the actual data I'll be using.
I got it. Yes, I was misinterpreting.
import csv
import xlwt
import xlsxwriter
from datetime import datetime

blue_count = 0
red_count = 0
blues = []
reds = []

with open(r'C:\Users\delliott\Desktop\SDA Update\colors.csv', 'r') as f:
    reader = csv.reader(f)
    next(reader, None)  # skip the headers
    for row in reader:
        for k in row[0]:
            if 'blue' in row[1]:
                blue_count += 1
                blues.append(row[0])
                break
    f.seek(0)  # rewind the file for the second pass
    next(reader, None)  # skip the headers
    for row in reader:
        for k in row[0]:
            if 'red' in row[1]:
                red_count += 1
                reds.append(row[0])
                break

print('blue count ' + str(blue_count))
print('red count ' + str(red_count))
Output:
blue count 6
red count 4
Thank you.
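Another option, swapped in here as an alternative to seek(0): read the rows into a list once and loop over that list as many times as needed. This is a minimal sketch that assumes the file is small enough to fit in memory; SAMPLE stands in for the real file and rows is a name chosen for illustration.

```python
import csv
import io

# Stand-in for the real file: the sample data from the first post, held in memory.
SAMPLE = (
    "ID,Colors\n"
    "1,red\n2,red\n3,blue\n4,red\n5,blue\n"
    "6,blue\n7,red\n8,blue\n9,blue\n10,blue\n"
)

with io.StringIO(SAMPLE) as f:
    reader = csv.reader(f)
    next(reader, None)   # skip the header row
    rows = list(reader)  # read every remaining row exactly once

# The list can be iterated any number of times, even after the file is closed.
blues = [row[0] for row in rows if row[1] == "blue"]
reds = [row[0] for row in rows if row[1] == "red"]

print("blue count", len(blues))
print("red count", len(reds))
```

Unlike the reader, the list is not consumed by iteration, so nothing has to be reset between loops.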
Posts: 8,090
Threads: 154
Joined: Sep 2016
Jul-30-2018, 04:51 PM
(This post was last modified: Jul-30-2018, 04:51 PM by buran.)
(Jul-30-2018, 04:42 PM)Huck Wrote: Thanks, buran. That will work with the example I provided. But I will need to reset with the actual data I'll be using.
I can't see a reason to reset, whatever the data is. It may just require changing the code here and there. Would you care to elaborate?
Also, why would you iterate over the id and break after the first iteration?
for k in row[0]:
    if 'blue' in row[1]:
        blue_count += 1
        blues.append(row[0])
        break
is the same as
if 'blue' in row[1]:
    blue_count += 1
    blues.append(row[0])
Posts: 13
Threads: 3
Joined: Apr 2018
Jul-30-2018, 05:28 PM
(This post was last modified: Jul-30-2018, 05:28 PM by Huck.)
(Jul-30-2018, 04:51 PM)buran Wrote: I don't possibly see a reason to reset, whatever the data is. It may just require to change the code here and there. Would you care to elaborate Also, why would you iterate over the id and break after first loop?
I kept getting blue = 7 until I put that break in the first loop. I didn't think I needed it either.
The actual data I am using is similar to another thread I started:
Help iterating through DictReader loaded from csv
It's hard to describe what I am actually doing. I tried that on Stack Overflow and only got negative responses. I am reading output from a SQL query that will have client IDs with multiple entries. I'll see if I can come up with a clear explanation and post the actual query I'm using later. There probably is a better way in Python. I'm still learning.
Thank you for taking the time to look and read the question.
Posts: 8,090
Threads: 154
Joined: Sep 2016
(Jul-30-2018, 05:28 PM)Huck Wrote: I kept getting blue = 7 until I put that break in the first loop. I didn't think I needed it either.
The loop was the problem; the break was a workaround for the problem caused by the unnecessary loop. You were getting 7 because, with the loop, a double-digit id is counted twice, i.e. every double-digit id is double-counted. If you had 3-digit ids, they would be triple-counted, etc.
Try to find a way to explain what you are doing and let us know. It MAY also be possible to fix the SQL statement itself.
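A small sketch of that double-counting, using only the blue rows from the sample data. The variable names with_loop and without_loop are made up for illustration.

```python
# The six blue rows from the sample file, as csv.reader would return them.
blue_rows = [["3", "blue"], ["5", "blue"], ["6", "blue"],
             ["8", "blue"], ["9", "blue"], ["10", "blue"]]

# Inner loop without a break: one increment per character of the ID.
with_loop = 0
for row in blue_rows:
    for k in row[0]:  # '10' yields '1' and then '0'
        if "blue" in row[1]:
            with_loop += 1

# No inner loop: one increment per row.
without_loop = 0
for row in blue_rows:
    if "blue" in row[1]:
        without_loop += 1

print(with_loop, without_loop)  # prints: 7 6
```

Only the ID '10' has two characters, which is where the extra count comes from.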
Posts: 13
Threads: 3
Joined: Apr 2018
(Jul-30-2018, 05:34 PM)buran Wrote: It was the loop that is a problem, the break was a way to solve the problem caused by the unnecessary loop. You were getting 7, because with the loop if id is double-digit, it will count it twice, i.e. every double-digit id is double-counted, if you had 3-digit ids, they will be triple-counted, etc.
try to find a way to explain what you are doing and let us know. Also it MAY be possible to fix the sql statement itself.
You are right. I tried it with multi IDs with double digits and again with 3-digits. I didn't doubt you. I just wanted to see it. I don't get why it works that way. It's comma delimited. I thought a double digit would be one iteration.
It will take a while to get an explanation typed up. I'll work on it after I finish some other things I have to complete.
Thanks.
Posts: 8,090
Threads: 154
Joined: Sep 2016
Jul-30-2018, 06:15 PM
(This post was last modified: Jul-30-2018, 07:35 PM by buran.)
(Jul-30-2018, 05:53 PM)Huck Wrote: I don't get why it works that way. It's comma delimited. I thought a double digit would be one iteration.
What you read from the file is strings, i.e. each field is a string. So what you think is the int 10 is actually the string '10'.
When you iterate over a string, you get one character at a time.
id = '12345'
for ch in id:
    print(ch)
id = 'Huck'
for ch in id:
    print(ch)