Python Forum
g Null Byte using DictReader - Printable Version



g Null Byte using DictReader - eshwinsukhdeve - May-15-2019

I have the code below:
stream = io.StringIO(csv_file.stream.read().decode('utf-8-sig'), newline=None)  # error is here

reader = csv.DictReader(stream)

list_of_entity = []
line_no, prev_len = 1, 0

for line in reader:
Executing the above code raises the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 252862: invalid start byte

To fix this, I then tried the following:
stream = io.StringIO(csv_file.stream.read().decode('unicode_escape'), newline=None)

reader = csv.DictReader(stream)

list_of_entity = []
line_no, prev_len = 1, 0

for line in reader:  # error is here
When I changed the decoding to unicode_escape, it threw the error "_csv.Error: line contains NULL byte" at the line marked with the comment above.

There is a null byte present in the CSV, and I want to ignore or replace it. Can anyone help with this?


RE: g Null Byte using DictReader - Larz60+ - May-15-2019

Have you tried plain utf-8?


RE: g Null Byte using DictReader - eshwinsukhdeve - May-15-2019

Hi Larz60+
I tried utf-8 as well, but I still get the same error:

_csv.Error: line contains NULL byte


RE: g Null Byte using DictReader - buran - May-15-2019

(May-15-2019, 05:08 AM)eshwinsukhdeve Wrote: I want to ignore or replace it
Pass the errors argument with the value 'ignore' or 'replace' to decode().

See the docs.
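
For illustration, here is a minimal sketch of what the two errors values do to a byte that is not valid UTF-8. The sample bytes are made up for this example and are not taken from the original CSV.

```python
# Hypothetical sample data: 0xa0 is a non-breaking space in Latin-1/cp1252,
# but on its own it is not a valid UTF-8 sequence.
raw = b"name,city\nAnn\xa0Lee,Oslo\n"

# errors='ignore' silently drops the undecodable byte.
ignored = raw.decode("utf-8-sig", errors="ignore")

# errors='replace' substitutes U+FFFD (the replacement character) for it.
replaced = raw.decode("utf-8-sig", errors="replace")

print(repr(ignored))
print(repr(replaced))
```

With 'ignore' the bad byte disappears without a trace, while 'replace' leaves a visible marker, which can make damaged fields easier to find later.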


RE: g Null Byte using DictReader - eshwinsukhdeve - May-15-2019

Can you show me in code how to pass the errors argument 'ignore' or 'replace' here?


RE: g Null Byte using DictReader - buran - May-15-2019

stream = io.StringIO(csv_file.stream.read().decode('utf-8-sig', error='ignore'), newline=None)

You can also try chardet (https://pypi.org/project/chardet/) to help work out what the encoding is.


RE: g Null Byte using DictReader - eshwinsukhdeve - May-15-2019

It says:

TypeError: 'error' is an invalid keyword argument for this function


RE: g Null Byte using DictReader - buran - May-15-2019

Sorry, it's errors, not error. My bad.
stream = io.StringIO(csv_file.stream.read().decode('utf-8-sig', errors='ignore'), newline=None)



RE: g Null Byte using DictReader - eshwinsukhdeve - May-15-2019

I am still getting the same error:
csv.Error: line contains NULL byte

The main issue is in the line of code below:
for line in reader:
The reader object contains null bytes and throws the error here.
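
One possible way around this, sketched here under assumptions, is to strip the NUL characters from the decoded text before handing it to csv.DictReader. The function name read_rows_without_nuls and the sample bytes are hypothetical; in the code above, the raw bytes would come from csv_file.stream.read().

```python
import csv
import io


def read_rows_without_nuls(raw, encoding="utf-8-sig"):
    """Decode raw CSV bytes, drop NUL characters, and return a list of row dicts.

    Hypothetical helper for illustration only.
    """
    # Replace undecodable bytes instead of raising UnicodeDecodeError.
    text = raw.decode(encoding, errors="replace")
    # NUL is a valid character after decoding, so it has to be removed explicitly.
    text = text.replace("\x00", "")
    return list(csv.DictReader(io.StringIO(text, newline=None)))


# Made-up sample with a NUL byte inside a field.
rows = read_rows_without_nuls(b"name,city\r\nAnn\x00,Oslo\r\n")
print(rows)
```

This reads the whole upload into memory, as the original code already does. For very large files, filtering line by line with a generator might be a better fit.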


RE: g Null Byte using DictReader - buran - May-15-2019

(May-15-2019, 05:08 AM)eshwinsukhdeve Wrote: stream = io.StringIO(csv_file.stream.read().decode('utf-8-sig'), newline=None)  # error is here
(May-15-2019, 05:08 AM)eshwinsukhdeve Wrote: While executing the above code I got the below error. UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 252862: invalid start byte

My suggestion should fix the error produced by this code.
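
As a rough check that the two errors are separate problems, the sketch below uses made-up bytes containing both a 0xa0 byte and a NUL byte. It suggests that errors='ignore' takes care of the UnicodeDecodeError but leaves the NUL in place, because 0x00 is itself valid UTF-8.

```python
# Hypothetical sample: one byte that is invalid UTF-8 (0xa0) and one NUL byte.
raw = b"id,name\n1,Ann\xa0\x00Lee\n"

# Strict decoding fails on the 0xa0 byte.
try:
    raw.decode("utf-8-sig")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors='ignore' drops 0xa0, but the NUL decodes fine and stays in the text.
text = raw.decode("utf-8-sig", errors="ignore")
nul_survived = "\x00" in text

print(strict_failed, nul_survived)  # True True
```

So the NUL characters would still need to be removed from the decoded text in a separate step before the csv module sees it.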