Python Forum
csv.reader(): Limit the number of columns read in Windows
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
csv.reader(): Limit the number of columns read in Windows
#1
I don't use Windows, but I am trying to show someone how to read and write .csv files

In Ubuntu, when I do this:

import csv
file1 = '/home/pedro/winter2020/20PY/dumpFiles/attendanceWinter2020_20PY.csv'
infile = open(file1, 'rb')
data1 = csv.reader(infile)

# show the data

for record in data1:
    print(record)
This only reads columns which have data, in this case 33 columns.

column 34 (and thereafter) contains no data in any row, so the reader stops.

Result: each record is 33 items long.

When I try this in Windows 10, Python 3.8.2, the length of each record is 1025!! (only the first 33 contain data)

How can I limit the number of columns csv.reader() reads in Windows 10??

OH, and Happy Christmas to everyone! (It's just a normal workday here in China!)
Reply
#2
I don't have Windows either but could it be that the file has Linux line endings b'\n' instead of Windows line endings b'\r\n' ? In that case, you could perhaps first convert the file to change the line ending.
Pedroski55 likes this post
Reply
#3
Try it this way:
import csv

file1 = '/home/pedro/winter2020/20PY/dumpFiles/attendanceWinter2020_20PY.csv'
with open(file1) as fp:
    crdr = csv.reader(fp, delimiter = ',')
    for data1 in crdr:
        print(data1)
or if there is a header and you want a dictionary item:
import csv

file1 = '/home/pedro/winter2020/20PY/dumpFiles/attendanceWinter2020_20PY.csv'
with open(file1) as fp:
    crdr = csv.DictReader(fp, delimiter = ',')
    for row in crdr:
        for key, value in row:
            print(f"{key}: {value}")
Pedroski55 likes this post
Reply
#4
It is hard to help resolve your issue without at least a few lines of input from your file.

If you can paste two or three lines of your file into a reply (use the "Insert Output" button to surround the lines) we may be able to help you discover the issue.

Peter
Pedroski55 likes this post
Reply
#5
Thanks for the replies during the holiday, I certainly wasn't expecting a response yesterday!

I checked the csv file with .readlines() Each line ends with a newline character, not \r\n.

I just tried again in Windows and this is the output in Idle:

>>> import csv
>>> path1 = 'C:/Users/small/Desktop/attendanceWinter2020_20PY.csv'
>>> infile = open(path1)
>>> data1 = csv.reader(infile)
>>> mydatalist = []
>>> for record in data1:
    mydatalist.append(record)

    
>>> print('mydatalist is: ' + str(len(mydatalist)) + ' long')
mydatalist is: 26 long
>>> a_row = mydatalist[4]
>>> len(a_row)
1025
>>> infile = open(path1)
>>> data1 = csv.reader(infile, delimiter=',')
>>> mydatalist = []
>>> for record in data1:
    mydatalist.append(record)

    
>>> a_row = mydatalist[4]
>>> len(a_row)
1025
>>> a_row
['', '', '1921090147', '郜阳杰', '0', '0', '0', '0', '0', '', '', '', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',
 '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
 '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ' (this goes on until the list is 1025 items long)
I can successfully extract the data I want.

But why is the row output 1025 long, when maximally 33 columns contain data?

This does not happen in Linux!
Reply
#6
Try open(path1, newline='\n') perhaps.
Reply
#7
Thanks for the example data line. When I try to run your original code (which opens the file in binary, mode "rb") in WSL Ubuntu Linux 20.04 under Win10-64 (python version 3.8.5) it gives this error:

Error:
Traceback (most recent call last): File "csvtest1.py", line 8, in <module> for record in data1: _csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)
I get the same error at a regular WIn-10 console prompt (my Windows python is also 3.8.5).

When I replace the Chinese utf-8 characters in your example with simple ASCII characters (I just used ABCD) and change the file open in your code to just "r" I get 1025 results, not 33.

Are there in fact 1025 commas in your input file? That implies a csv "data length" of 1025 columns, not 33. CSV reading (unless modified) will supply an empty field ("'', " in your example output) for every comma (or every occurrence of the "field separator" character that you set) which has no data preceding it.

Are you reading in binary in order to consume the multi-byte values of the Chinese characters in your example? If that is the case, what did you set the CSV field separator value to? (I know the python csv reader has an option to do that).

Perhaps the combination of reading in binary and supplying a csv field separator value of something other than comma is causing your difference?

Just guessing here. It seems we do not have enough information about your environment to completely figure out what is going on. What is your locale setting? What Linux distribution? Is there other code preceding your short example that sets the csv field separator and/or record separator values to other than the default values?

How are you getting the csv reader to read your data file when the file is opened in binary instead of text format?

If you could upload an extract of one record of your actual input file here as a "data.txt" file, perhaps a hex edit of that one record would tell us quite a lot.

Here is the ASCII-fied copy of your data file that I created, which has 1025 commas:

Output:
,,1921090147,ABCD,0,0,0,0,0,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
HTH

Peter
Reply
#8
Well, thanks for trying!

Don't spoil your holiday!

I opened the csv file with gedit, a Linux text editor, if you know it.

These are the contents of my .csv file. No line is 1025 items long!!:

Quote:,,,,max Attendance,,,Max Score,425,,,,50,25,,50,25,,50,25,,50,25,,50,25,,0,25,,0,25,
,,,23名,8,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Photo,phone,学号,姓名,出勤,出勤率,期末分数,得分率,得分,HW,CW,Week 16,HW,CW,Week 15,HW,CW,Week 14,HW,CW,Week 13,HW,CW,Week 12,HW,CW,Week 11,HW,CW,Week 10,HW,CW,Week 9
,,1921030146,聂林,0,0,0,0,0,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,,1921090147,郜阳杰,0,0,0,0,0,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,,1922190144,汪毓韬,0,0,0,0,0,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,,1923010312,张长乐,7,100,100,100,425,,,,50,25,1,50,25,1,50,25,1,50,25,1,50,25,1,0,25,1,0,25,1
,,1925010121,陈尚,4,57.1,47.6,41.2,175,,,,0,25,1,50,25,1,0,0,0,0,25,0,0,25,1,0,25,1,0,0,0
,,1926070127,蒲瑞杰,6,85.7,83.7,82.4,350,,,,50,25,1,50,25,1,50,25,1,50,25,1,0,25,1,0,25,1,0,0,0
,,1926070150,张旭,1,14.3,9.3,5.9,25,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,25,1,0,0,0,0,0,0
,,1926080140,陈子华,0,0,0,0,0,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,,1926080205,王天亮,5,71.4,56.8,47.1,200,,,,0,25,0,0,25,1,0,25,1,0,25,1,50,25,1,0,25,1,0,0,0
,,1926080209,闫怀壮,3,42.9,27.7,17.6,75,,,,0,0,0,0,0,0,0,0,0,0,25,1,0,25,1,0,25,1,0,0,0
,,1926080216,朱增援,6,85.7,84.4,83.5,355,,,,50,25,0,40,25,1,0,25,1,40,25,1,50,25,1,0,25,1,0,25,1
,,1926080230,杨澜,6,85.7,62.5,47.1,200,,,,0,25,1,0,25,1,0,25,1,0,25,1,50,25,1,0,25,1,0,0,0
,,1926080234,张建军,6,85.7,62.5,47.1,200,,,,0,25,1,0,25,1,0,25,1,0,25,1,50,25,1,0,25,1,0,0,0
,,1926080243,游杰,7,100,100,100,425,,,,50,25,1,50,25,1,50,25,1,50,25,1,50,25,1,0,25,1,0,25,1
,,1926080254,王润迪,6,85.7,62.5,47.1,200,,,,0,25,1,0,25,1,0,0,0,0,25,1,50,25,1,0,25,1,0,25,1
,,1927040125,黄楚兰,7,100,91.5,85.9,365,,,,50,25,1,50,25,1,0,25,1,40,25,1,50,25,1,0,25,1,0,25,1
,,1928040103,曹予曦,5,71.4,46.2,29.4,125,,,,0,0,0,0,25,1,0,0,0,0,25,1,0,25,1,0,25,1,0,25,1
,,1928040213,严昕,5,71.4,53.3,41.2,175,,,,0,25,1,0,25,1,0,25,1,0,25,1,50,25,1,0,0,0,0,0,0
,,1929020321,伍竹,5,71.4,46.2,29.4,125,,,,0,25,1,0,0,0,0,25,1,0,25,1,0,25,1,0,25,1,0,0,0
,,1929020325,张清源,6,85.7,55.5,35.3,150,,,,0,25,1,0,0,0,0,25,1,0,25,1,0,25,1,0,25,1,0,25,1
,,1929020329,章明慧,7,100,57.6,29.4,125,,,,0,25,1,0,25,1,0,0,1,0,25,1,0,25,1,0,25,1,0,0,1
,,1929030133,丁桃园,7,100,70.4,50.6,215,,,,0,25,1,40,25,1,0,25,1,0,25,1,0,25,1,0,25,1,0,25,1
,,1929050140,石楷,7,100,90.1,83.5,355,,,,50,25,1,0,25,1,50,25,1,50,25,1,30,25,1,0,25,1,0,25,1
Reply
#9
Microsoft Excel has a maximum of 255 columns. Importing a CSV record with more than 255 fields can cause some versions of Microsoft Excel to exhibit strange behavior. That is probably why you're only able to do a maximum of 255 fields with the OleDBDataAdapter.

Regards,
Caleb
Reply
#10
Using a copy of the file you pasted in your last message stored on my machine in "utf-8" encoding, this seems to work:

import csv

file1 = 'C:/Users/MyUserId/envtest/testutf.csv'
with open(file1, encoding="utf-8") as fp:
    crdr = csv.reader(fp, delimiter = ',')
    for data1 in crdr:
        print(data1)
Output:
['', '', '', '', 'max Attendance', '', '', 'Max Score', '425', '', '', '', '50', '25', '', '50', '25', '', '50', '25', '', '50', '25', '', '50', '25', '', '0', '25', '', '0', '25', ''] ['', '', '', '23名', '8', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ''] ['Photo', 'phone', '学号', '姓名', '出勤', '出勤率', '期末分数', '得分率', '得分', 'HW', 'CW', 'Week 16', 'HW', 'CW', 'Week 15', 'HW', 'CW', 'Week 14', 'HW', 'CW', 'Week 13', 'HW', 'CW', 'Week 12', 'HW', 'CW', 'Week 11', 'HW', 'CW', 'Week 10', 'HW', 'CW', 'Week 9'] ['', '', '1921030146', '聂林', '0', '0', '0', '0', '0', '', '', '', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] ['', '', '1921090147', '郜阳杰', '0', '0', '0', '0', '0', '', '', '', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] ['', '', '1922190144', '汪毓韬', '0', '0', '0', '0', '0', '', '', '', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] ['', '', '1923010312', '张长乐', '7', '100', '100', '100', '425', '', '', '', '50', '25', '1', '50', '25', '1', '50', '25', '1', '50', '25', '1', '50', '25', '1', '0', '25', '1', '0', '25', '1'] ['', '', '1925010121', '陈尚', '4', '57.1', '47.6', '41.2', '175', '', '', '', '0', '25', '1', '50', '25', '1', '0', '0', '0', '0', '25', '0', '0', '25', '1', '0', '25', '1', '0', '0', '0'] ['', '', '1926070127', '蒲瑞杰', '6', '85.7', '83.7', '82.4', '350', '', '', '', '50', '25', '1', '50', '25', '1', '50', '25', '1', '50', '25', '1', '0', '25', '1', '0', '25', '1', '0', '0', '0'] ['', '', '1926070150', '张旭', '1', '14.3', '9.3', '5.9', '25', '', '', '', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '25', '1', '0', '0', '0', '0', '0', '0'] ['', '', '1926080140', '陈子华', '0', '0', '0', '0', '0', '', '', '', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] ['', '', '1926080205', '王天亮', '5', '71.4', '56.8', '47.1', '200', '', '', '', '0', '25', '0', '0', '25', '1', '0', '25', '1', '0', '25', '1', '50', '25', '1', '0', '25', '1', '0', '0', '0'] ['', '', '1926080209', '闫怀壮', '3', '42.9', '27.7', '17.6', '75', '', '', '', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '25', '1', '0', '25', '1', '0', '25', '1', '0', '0', '0'] ['', '', '1926080216', '朱增援', '6', '85.7', '84.4', '83.5', '355', '', '', '', '50', '25', '0', '40', '25', '1', '0', '25', '1', '40', '25', '1', '50', '25', '1', '0', '25', '1', '0', '25', '1'] ['', '', '1926080230', '杨澜', '6', '85.7', '62.5', '47.1', '200', '', '', '', '0', '25', '1', '0', '25', '1', '0', '25', '1', '0', '25', '1', '50', '25', '1', '0', '25', '1', '0', '0', '0'] ['', '', '1926080234', '张建军', '6', '85.7', '62.5', '47.1', '200', '', '', '', '0', '25', '1', '0', '25', '1', '0', '25', '1', '0', '25', '1', '50', '25', '1', '0', '25', '1', '0', '0', '0'] ['', '', '1926080243', '游杰', '7', '100', '100', '100', '425', '', '', '', '50', '25', '1', '50', '25', '1', '50', '25', '1', '50', '25', '1', '50', '25', '1', '0', '25', '1', '0', '25', '1'] ['', '', '1926080254', '王润迪', '6', '85.7', '62.5', '47.1', '200', '', '', '', '0', '25', '1', '0', '25', '1', '0', '0', '0', '0', '25', '1', '50', '25', '1', '0', '25', '1', '0', '25', '1'] ['', '', '1927040125', '黄楚兰', '7', '100', '91.5', '85.9', '365', '', '', '', '50', '25', '1', '50', '25', '1', '0', '25', '1', '40', '25', '1', '50', '25', '1', '0', '25', '1', '0', '25', '1'] ['', '', '1928040103', '曹予曦', '5', '71.4', '46.2', '29.4', '125', '', '', '', '0', '0', '0', '0', '25', '1', '0', '0', '0', '0', '25', '1', '0', '25', '1', '0', '25', '1', '0', '25', '1'] ['', '', '1928040213', '严昕', '5', '71.4', '53.3', '41.2', '175', '', '', '', '0', '25', '1', '0', '25', '1', '0', '25', '1', '0', '25', '1', '50', '25', '1', '0', '0', '0', '0', '0', '0'] ['', '', '1929020321', '伍竹', '5', '71.4', '46.2', '29.4', '125', '', '', '', '0', '25', '1', '0', '0', '0', '0', '25', '1', '0', '25', '1', '0', '25', '1', '0', '25', '1', '0', '0', '0'] ['', '', '1929020325', '张清源', '6', '85.7', '55.5', '35.3', '150', '', '', '', '0', '25', '1', '0', '0', '0', '0', '25', '1', '0', '25', '1', '0', '25', '1', '0', '25', '1', '0', '25', '1'] ['', '', '1929020329', '章明慧', '7', '100', '57.6', '29.4', '125', '', '', '', '0', '25', '1', '0', '25', '1', '0', '0', '1', '0', '25', '1', '0', '25', '1', '0', '25', '1', '0', '0', '1'] ['', '', '1929030133', '丁桃园', '7', '100', '70.4', '50.6', '215', '', '', '', '0', '25', '1', '40', '25', '1', '0', '25', '1', '0', '25', '1', '0', '25', '1', '0', '25', '1', '0', '25', '1'] ['', '', '1929050140', '石楷', '7', '100', '90.1', '83.5', '355', '', '', '', '50', '25', '1', '0', '25', '1', '50', '25', '1', '50', '25', '1', '30', '25', '1', '0', '25', '1', '0', '25', '1']
I hope that helps you.

Peter
snippsat likes this post
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Read complete windows registry? fredep57 3 901 Mar-15-2023, 08:14 PM
Last Post: buran
  Read directory listing of files and parse out the highest number? cubangt 5 2,252 Sep-28-2022, 10:15 PM
Last Post: Larz60+
  xml simple reader kucingkembar 2 1,022 Aug-19-2022, 08:51 PM
Last Post: kucingkembar
  Having strange results from an RFID HID card reader - I'm stuck orbisnz 1 1,431 Mar-28-2022, 08:20 AM
Last Post: Larz60+
  Thoughts on interfacing with a QR code reader that outputs keystrokes? wrybread 1 1,448 Oct-08-2021, 03:44 PM
Last Post: bowlofred
  Remove Specific Columns when the number of columns is greater than a specific value CuriousOne 0 1,269 Sep-09-2021, 09:17 PM
Last Post: CuriousOne
  NFC reader code help johnroberts2k 1 2,518 Jul-02-2021, 08:43 PM
Last Post: deanhystad
  [Solved] Using readlines to read data file and sum columns Laplace12 4 3,464 Jun-16-2021, 12:46 PM
Last Post: Laplace12
  Read strings and numbers in columns from a file suvadip 4 2,781 Aug-11-2020, 09:37 PM
Last Post: suvadip
  Closing Files - CSV Reader/Writer lummers 2 2,564 May-28-2020, 06:36 AM
Last Post: Knight18

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020