Dec-27-2020, 05:39 AM
(This post was last modified: Dec-27-2020, 01:05 PM by Larz60+. Edit Reason: fixed bbcode tags)
Thanks for the example data line. When I try to run your original code (which opens the file in binary mode, "rb") in WSL Ubuntu Linux 20.04 under Win10-64 (Python 3.8.5), it gives this error:
Error:
Traceback (most recent call last):
  File "csvtest1.py", line 8, in <module>
    for record in data1:
_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)
I get the same error at a regular Win10 console prompt (my Windows Python is also 3.8.5).
When I replace the Chinese UTF-8 characters in your example with simple ASCII characters (I just used ABCD) and change the file open in your code to just "r", I get 1025 results, not 33.
Are there in fact 1025 commas in your input file? That implies a CSV "data length" of 1025 columns, not 33. Unless configured otherwise, the CSV reader returns an empty field (the "'', " entries in your example output) for every comma (or every occurrence of whatever field separator you set) that has no data preceding it.
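To make the empty-field behaviour concrete, here is a minimal sketch. The sample line is made up for illustration (it is not your data), and io.StringIO just stands in for a file opened in text mode. For an unquoted line, the reader should return one more field than there are commas, so comparing the two counts is a quick sanity check.

```python
import csv
import io

# Hypothetical one-line sample, not the original poster's data.
sample = ",,1921090147,ABCD,0,,,\n"

# csv.reader accepts any iterable of strings; StringIO stands in for a text file.
rows = list(csv.reader(io.StringIO(sample)))
record = rows[0]

comma_count = sample.count(",")   # delimiters in the raw line
field_count = len(record)         # fields produced by the reader

print(comma_count, field_count)
print(record)                     # empty strings appear where nothing sits between commas
```

Running the same two counts against one line of the real file would show whether the 1025 figure comes from the data itself or from the way it is being read.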
Are you reading in binary in order to consume the multi-byte values of the Chinese characters in your example? If that is the case, what did you set the CSV field separator to? (Python's csv.reader takes a "delimiter" argument for that.)
Perhaps the combination of reading in binary and supplying a csv field separator value of something other than comma is causing your difference?
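As an illustration of how much the separator setting changes the result, here is a small sketch using the "delimiter" argument of csv.reader. The semicolon-separated sample line is an assumption for demonstration only; I do not know what separator, if any, your code sets.

```python
import csv
import io

# Hypothetical sample that uses ';' as the field separator.
sample = "1921090147;ABCD;0;;0\n"

# With the default delimiter (comma) the whole line is a single field.
default_record = next(csv.reader(io.StringIO(sample)))

# With delimiter=";" the line is split into its individual fields.
custom_record = next(csv.reader(io.StringIO(sample), delimiter=";"))

print(len(default_record), default_record)
print(len(custom_record), custom_record)
```

If the field count you see differs from what you expect, a mismatch between the delimiter in the file and the one given to the reader is one possible cause worth ruling out.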
Just guessing here. It seems we do not have enough information about your environment to figure out completely what is going on. What is your locale setting? What Linux distribution? Is there other code preceding your short example that sets the CSV field separator and/or record separator to something other than the defaults?
How are you getting the csv reader to read your data file when the file is opened in binary instead of text format?
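For comparison, this is roughly how I would expect a UTF-8 file to be read in Python 3: text mode with an explicit encoding, plus newline="" as the csv module documentation recommends. It is only a sketch; the temporary file and its one-line content are invented so the example runs on its own.

```python
import csv
import os
import tempfile

# Hypothetical record containing non-ASCII text; not the original poster's data.
line = ",,1921090147,中文,0,0\n"

# Write the sample as raw UTF-8 bytes so the read step below is realistic.
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(line.encode("utf-8"))
    path = tmp.name

try:
    # Text mode: Python decodes the bytes, so csv.reader receives strings.
    with open(path, "r", encoding="utf-8", newline="") as f:
        records = list(csv.reader(f))
finally:
    os.remove(path)

print(records[0])
```

Opening the same file with "rb" and passing it straight to csv.reader is what produces the "iterator should return strings, not bytes" error on Python 3.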
If you could upload an extract of one record of your actual input file here as a "data.txt" file, a hex dump of that one record would probably tell us quite a lot.
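If a hex editor is not handy, a few lines of Python can produce a similar dump. This is a rough sketch: hex_dump is a helper name I made up, and the sample bytes are invented. It shows the delimiter bytes (2c for a comma), the multi-byte UTF-8 sequences, and the line ending.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Return an offset / hex / printable-ASCII dump of the given bytes."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable and non-ASCII bytes are shown as '.'
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3}} {text_part}")
    return "\n".join(lines)


# Hypothetical record; in practice the bytes would come from reading the file in "rb" mode.
sample = ",,1921090147,中文,0\r\n".encode("utf-8")
dump = hex_dump(sample)
print(dump)
```

For a real file, something like open("data.txt", "rb").read(64) in place of the sample bytes would dump the first 64 bytes.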
Here is the ASCII-fied copy of your data file that I created, which has 1025 commas:
Output:,,1921090147,ABCD,0,0,0,0,0,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
HTH
Peter