Python Forum

Full Version: csv.reader(): Limit the number of columns read in Windows
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
I don't use Windows, but I am trying to show someone how to read and write .csv files

In Ubuntu, when I do this:

import csv
file1 = '/home/pedro/winter2020/20PY/dumpFiles/attendanceWinter2020_20PY.csv'
infile = open(file1, 'rb')
data1 = csv.reader(infile)

# show the data

for record in data1:
    print(record)
This only reads columns which have data, in this case 33 columns.

column 34 (and thereafter) contains no data in any row, so the reader stops.

Result: each record is 33 items long.

When I try this in Windows 10, Python 3.8.2, the length of each record is 1025!! (only the first 33 contain data)

How can I limit the number of columns csv.reader() reads in Windows 10??

OH, and Happy Christmas to everyone! (It's just a normal workday here in China!)
I don't have Windows either but could it be that the file has Linux line endings b'\n' instead of Windows line endings b'\r\n' ? In that case, you could perhaps first convert the file to change the line ending.
Try it this way:
import csv

file1 = '/home/pedro/winter2020/20PY/dumpFiles/attendanceWinter2020_20PY.csv'
with open(file1) as fp:
    crdr = csv.reader(fp, delimiter = ',')
    for data1 in crdr:
        print(data1)
or if there is a header and you want a dictionary item:
import csv

file1 = '/home/pedro/winter2020/20PY/dumpFiles/attendanceWinter2020_20PY.csv'
with open(file1) as fp:
    crdr = csv.DictReader(fp, delimiter = ',')
    for row in crdr:
        for key, value in row:
            print(f"{key}: {value}")
It is hard to help resolve your issue without at least a few lines of input from your file.

If you can paste two or three lines of your file into a reply (use the "Insert Output" button to surround the lines) we may be able to help you discover the issue.

Peter
Thanks for the replies during the holiday, I certainly wasn't expecting a response yesterday!

I checked the csv file with .readlines() Each line ends with a newline character, not \r\n.

I just tried again in Windows and this is the output in Idle:

>>> import csv
>>> path1 = 'C:/Users/small/Desktop/attendanceWinter2020_20PY.csv'
>>> infile = open(path1)
>>> data1 = csv.reader(infile)
>>> mydatalist = []
>>> for record in data1:
    mydatalist.append(record)

    
>>> print('mydatalist is: ' + str(len(mydatalist)) + ' long')
mydatalist is: 26 long
>>> a_row = mydatalist[4]
>>> len(a_row)
1025
>>> infile = open(path1)
>>> data1 = csv.reader(infile, delimiter=',')
>>> mydatalist = []
>>> for record in data1:
    mydatalist.append(record)

    
>>> a_row = mydatalist[4]
>>> len(a_row)
1025
>>> a_row
['', '', '1921090147', '郜阳杰', '0', '0', '0', '0', '0', '', '', '', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',
 '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
 '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ' (this goes on until the list is 1025 items long)
I can successfully extract the data I want.

But why is the row output 1025 long, when maximally 33 columns contain data?

This does not happen in Linux!
Try open(path1, newline='\n') perhaps.
Thanks for the example data line. When I try to run your original code (which opens the file in binary, mode "rb") in WSL Ubuntu Linux 20.04 under Win10-64 (python version 3.8.5) it gives this error:

Error:
Traceback (most recent call last): File "csvtest1.py", line 8, in <module> for record in data1: _csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)
I get the same error at a regular WIn-10 console prompt (my Windows python is also 3.8.5).

When I replace the Chinese utf-8 characters in your example with simple ASCII characters (I just used ABCD) and change the file open in your code to just "r" I get 1025 results, not 33.

Are there in fact 1025 commas in your input file? That implies a csv "data length" of 1025 columns, not 33. CSV reading (unless modified) will supply an empty field ("'', " in your example output) for every comma (or every occurrence of the "field separator" character that you set) which has no data preceding it.

Are you reading in binary in order to consume the multi-byte values of the Chinese characters in your example? If that is the case, what did you set the CSV field separator value to? (I know the python csv reader has an option to do that).

Perhaps the combination of reading in binary and supplying a csv field separator value of something other than comma is causing your difference?

Just guessing here. It seems we do not have enough information about your environment to completely figure out what is going on. What is your locale setting? What Linux distribution? Is there other code preceding your short example that sets the csv field separator and/or record separator values to other than the default values?

How are you getting the csv reader to read your data file when the file is opened in binary instead of text format?

If you could upload an extract of one record of your actual input file here as a "data.txt" file, perhaps a hex edit of that one record would tell us quite a lot.

Here is the ASCII-fied copy of your data file that I created, which has 1025 commas:

Output:
,,1921090147,ABCD,0,0,0,0,0,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
HTH

Peter
Well, thanks for trying!

Don't spoil your holiday!

I opened the csv file with gedit, a Linux text editor, if you know it.

These are the contents of my .csv file. No line is 1025 items long!!:

Quote:,,,,max Attendance,,,Max Score,425,,,,50,25,,50,25,,50,25,,50,25,,50,25,,0,25,,0,25,
,,,23名,8,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Photo,phone,学号,姓名,出勤,出勤率,期末分数,得分率,得分,HW,CW,Week 16,HW,CW,Week 15,HW,CW,Week 14,HW,CW,Week 13,HW,CW,Week 12,HW,CW,Week 11,HW,CW,Week 10,HW,CW,Week 9
,,1921030146,聂林,0,0,0,0,0,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,,1921090147,郜阳杰,0,0,0,0,0,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,,1922190144,汪毓韬,0,0,0,0,0,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,,1923010312,张长乐,7,100,100,100,425,,,,50,25,1,50,25,1,50,25,1,50,25,1,50,25,1,0,25,1,0,25,1
,,1925010121,陈尚,4,57.1,47.6,41.2,175,,,,0,25,1,50,25,1,0,0,0,0,25,0,0,25,1,0,25,1,0,0,0
,,1926070127,蒲瑞杰,6,85.7,83.7,82.4,350,,,,50,25,1,50,25,1,50,25,1,50,25,1,0,25,1,0,25,1,0,0,0
,,1926070150,张旭,1,14.3,9.3,5.9,25,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,25,1,0,0,0,0,0,0
,,1926080140,陈子华,0,0,0,0,0,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,,1926080205,王天亮,5,71.4,56.8,47.1,200,,,,0,25,0,0,25,1,0,25,1,0,25,1,50,25,1,0,25,1,0,0,0
,,1926080209,闫怀壮,3,42.9,27.7,17.6,75,,,,0,0,0,0,0,0,0,0,0,0,25,1,0,25,1,0,25,1,0,0,0
,,1926080216,朱增援,6,85.7,84.4,83.5,355,,,,50,25,0,40,25,1,0,25,1,40,25,1,50,25,1,0,25,1,0,25,1
,,1926080230,杨澜,6,85.7,62.5,47.1,200,,,,0,25,1,0,25,1,0,25,1,0,25,1,50,25,1,0,25,1,0,0,0
,,1926080234,张建军,6,85.7,62.5,47.1,200,,,,0,25,1,0,25,1,0,25,1,0,25,1,50,25,1,0,25,1,0,0,0
,,1926080243,游杰,7,100,100,100,425,,,,50,25,1,50,25,1,50,25,1,50,25,1,50,25,1,0,25,1,0,25,1
,,1926080254,王润迪,6,85.7,62.5,47.1,200,,,,0,25,1,0,25,1,0,0,0,0,25,1,50,25,1,0,25,1,0,25,1
,,1927040125,黄楚兰,7,100,91.5,85.9,365,,,,50,25,1,50,25,1,0,25,1,40,25,1,50,25,1,0,25,1,0,25,1
,,1928040103,曹予曦,5,71.4,46.2,29.4,125,,,,0,0,0,0,25,1,0,0,0,0,25,1,0,25,1,0,25,1,0,25,1
,,1928040213,严昕,5,71.4,53.3,41.2,175,,,,0,25,1,0,25,1,0,25,1,0,25,1,50,25,1,0,0,0,0,0,0
,,1929020321,伍竹,5,71.4,46.2,29.4,125,,,,0,25,1,0,0,0,0,25,1,0,25,1,0,25,1,0,25,1,0,0,0
,,1929020325,张清源,6,85.7,55.5,35.3,150,,,,0,25,1,0,0,0,0,25,1,0,25,1,0,25,1,0,25,1,0,25,1
,,1929020329,章明慧,7,100,57.6,29.4,125,,,,0,25,1,0,25,1,0,0,1,0,25,1,0,25,1,0,25,1,0,0,1
,,1929030133,丁桃园,7,100,70.4,50.6,215,,,,0,25,1,40,25,1,0,25,1,0,25,1,0,25,1,0,25,1,0,25,1
,,1929050140,石楷,7,100,90.1,83.5,355,,,,50,25,1,0,25,1,50,25,1,50,25,1,30,25,1,0,25,1,0,25,1
Microsoft Excel has a maximum of 255 columns. Importing a CSV record with more than 255 fields can cause some versions of Microsoft Excel to exhibit strange behavior. That is probably why you're only able to do a maximum of 255 fields with the OleDBDataAdapter.

Regards,
Caleb
Using a copy of the file you pasted in your last message stored on my machine in "utf-8" encoding, this seems to work:

import csv

file1 = 'C:/Users/MyUserId/envtest/testutf.csv'
with open(file1, encoding="utf-8") as fp:
    crdr = csv.reader(fp, delimiter = ',')
    for data1 in crdr:
        print(data1)
Output:
['', '', '', '', 'max Attendance', '', '', 'Max Score', '425', '', '', '', '50', '25', '', '50', '25', '', '50', '25', '', '50', '25', '', '50', '25', '', '0', '25', '', '0', '25', ''] ['', '', '', '23名', '8', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ''] ['Photo', 'phone', '学号', '姓名', '出勤', '出勤率', '期末分数', '得分率', '得分', 'HW', 'CW', 'Week 16', 'HW', 'CW', 'Week 15', 'HW', 'CW', 'Week 14', 'HW', 'CW', 'Week 13', 'HW', 'CW', 'Week 12', 'HW', 'CW', 'Week 11', 'HW', 'CW', 'Week 10', 'HW', 'CW', 'Week 9'] ['', '', '1921030146', '聂林', '0', '0', '0', '0', '0', '', '', '', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] ['', '', '1921090147', '郜阳杰', '0', '0', '0', '0', '0', '', '', '', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] ['', '', '1922190144', '汪毓韬', '0', '0', '0', '0', '0', '', '', '', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] ['', '', '1923010312', '张长乐', '7', '100', '100', '100', '425', '', '', '', '50', '25', '1', '50', '25', '1', '50', '25', '1', '50', '25', '1', '50', '25', '1', '0', '25', '1', '0', '25', '1'] ['', '', '1925010121', '陈尚', '4', '57.1', '47.6', '41.2', '175', '', '', '', '0', '25', '1', '50', '25', '1', '0', '0', '0', '0', '25', '0', '0', '25', '1', '0', '25', '1', '0', '0', '0'] ['', '', '1926070127', '蒲瑞杰', '6', '85.7', '83.7', '82.4', '350', '', '', '', '50', '25', '1', '50', '25', '1', '50', '25', '1', '50', '25', '1', '0', '25', '1', '0', '25', '1', '0', '0', '0'] ['', '', '1926070150', '张旭', '1', '14.3', '9.3', '5.9', '25', '', '', '', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '25', '1', '0', '0', '0', '0', '0', '0'] ['', '', '1926080140', '陈子华', '0', '0', '0', '0', '0', '', '', '', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] ['', '', '1926080205', '王天亮', '5', '71.4', '56.8', '47.1', '200', '', '', '', '0', '25', '0', '0', '25', '1', '0', '25', '1', '0', '25', '1', '50', '25', '1', '0', '25', '1', '0', '0', '0'] ['', '', '1926080209', '闫怀壮', '3', '42.9', '27.7', '17.6', '75', '', '', '', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '25', '1', '0', '25', '1', '0', '25', '1', '0', '0', '0'] ['', '', '1926080216', '朱增援', '6', '85.7', '84.4', '83.5', '355', '', '', '', '50', '25', '0', '40', '25', '1', '0', '25', '1', '40', '25', '1', '50', '25', '1', '0', '25', '1', '0', '25', '1'] ['', '', '1926080230', '杨澜', '6', '85.7', '62.5', '47.1', '200', '', '', '', '0', '25', '1', '0', '25', '1', '0', '25', '1', '0', '25', '1', '50', '25', '1', '0', '25', '1', '0', '0', '0'] ['', '', '1926080234', '张建军', '6', '85.7', '62.5', '47.1', '200', '', '', '', '0', '25', '1', '0', '25', '1', '0', '25', '1', '0', '25', '1', '50', '25', '1', '0', '25', '1', '0', '0', '0'] ['', '', '1926080243', '游杰', '7', '100', '100', '100', '425', '', '', '', '50', '25', '1', '50', '25', '1', '50', '25', '1', '50', '25', '1', '50', '25', '1', '0', '25', '1', '0', '25', '1'] ['', '', '1926080254', '王润迪', '6', '85.7', '62.5', '47.1', '200', '', '', '', '0', '25', '1', '0', '25', '1', '0', '0', '0', '0', '25', '1', '50', '25', '1', '0', '25', '1', '0', '25', '1'] ['', '', '1927040125', '黄楚兰', '7', '100', '91.5', '85.9', '365', '', '', '', '50', '25', '1', '50', '25', '1', '0', '25', '1', '40', '25', '1', '50', '25', '1', '0', '25', '1', '0', '25', '1'] ['', '', '1928040103', '曹予曦', '5', '71.4', '46.2', '29.4', '125', '', '', '', '0', '0', '0', '0', '25', '1', '0', '0', '0', '0', '25', '1', '0', '25', '1', '0', '25', '1', '0', '25', '1'] ['', '', '1928040213', '严昕', '5', '71.4', '53.3', '41.2', '175', '', '', '', '0', '25', '1', '0', '25', '1', '0', '25', '1', '0', '25', '1', '50', '25', '1', '0', '0', '0', '0', '0', '0'] ['', '', '1929020321', '伍竹', '5', '71.4', '46.2', '29.4', '125', '', '', '', '0', '25', '1', '0', '0', '0', '0', '25', '1', '0', '25', '1', '0', '25', '1', '0', '25', '1', '0', '0', '0'] ['', '', '1929020325', '张清源', '6', '85.7', '55.5', '35.3', '150', '', '', '', '0', '25', '1', '0', '0', '0', '0', '25', '1', '0', '25', '1', '0', '25', '1', '0', '25', '1', '0', '25', '1'] ['', '', '1929020329', '章明慧', '7', '100', '57.6', '29.4', '125', '', '', '', '0', '25', '1', '0', '25', '1', '0', '0', '1', '0', '25', '1', '0', '25', '1', '0', '25', '1', '0', '0', '1'] ['', '', '1929030133', '丁桃园', '7', '100', '70.4', '50.6', '215', '', '', '', '0', '25', '1', '40', '25', '1', '0', '25', '1', '0', '25', '1', '0', '25', '1', '0', '25', '1', '0', '25', '1'] ['', '', '1929050140', '石楷', '7', '100', '90.1', '83.5', '355', '', '', '', '50', '25', '1', '0', '25', '1', '50', '25', '1', '50', '25', '1', '30', '25', '1', '0', '25', '1', '0', '25', '1']
I hope that helps you.

Peter