Python Forum

Full Version: g Null Byte using DictReader
Pages: 1 2
I have the following code:
import csv
import io

stream = io.StringIO(csv_file.stream.read().decode('utf-8-sig'), newline=None)  # error is here

reader = csv.DictReader(stream)

list_of_entity = []
line_no, prev_len = 1, 0

for line in reader:
Running this code raises the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 252862: invalid start byte

To fix this, I then tried the following:
stream = io.StringIO(csv_file.stream.read().decode('unicode_escape'), newline=None)

reader = csv.DictReader(stream)

list_of_entity = []
line_no, prev_len = 1, 0

for line in reader:  # error is here
When I change the decode argument to unicode_escape, it throws the error "_csv.Error: line contains NULL byte" at the line marked with the comment above.

There is a null byte in the CSV file, and I want to ignore or replace it. Can anyone help with this?
Have you tried plain utf-8?
Hi Larz60+
I also tried utf-8, but I still get the same error:

_csv.Error: line contains NULL byte
(May-15-2019, 05:08 AM) eshwinsukhdeve wrote: I want to ignore or replace it
Pass the errors argument with the value 'ignore' or 'replace' to decode().

See the docs for decode().
Can you show me in code how to pass that argument with 'ignore' or 'replace' here?
stream = io.StringIO(csv_file.stream.read().decode('utf-8-sig', error='ignore'), newline=None)

You can also try chardet (https://pypi.org/project/chardet/) to help work out what the encoding is.
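If installing chardet is not an option, one rough stdlib-only alternative is to try a short list of candidate encodings in order. This is only a sketch: the function name decode_with_fallback and the candidate list are assumptions, not something from this thread, and a successful decode does not prove the guess is right. Byte 0xa0 is a non-breaking space in cp1252 and latin-1, which is why those two are listed.

```python
def decode_with_fallback(raw_bytes, candidates=("utf-8-sig", "cp1252", "latin-1")):
    """Return (text, encoding) for the first candidate that decodes cleanly.

    The candidate list is a guess; adjust it to where the file came from.
    """
    for encoding in candidates:
        try:
            return raw_bytes.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # this candidate cannot represent the bytes, try the next
    raise ValueError("none of the candidate encodings could decode the data")


# 0xa0 is not valid UTF-8 on its own, so the first candidate fails here.
text, used = decode_with_fallback(b"name,city\nAnna,New\xa0York\n")
```

Note that latin-1 accepts every byte value, so it works as a last resort that never raises but may produce the wrong characters.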
It says:

TypeError: 'error' is an invalid keyword argument for this function
Sorry, it's errors, not error. My bad.
stream = io.StringIO(csv_file.stream.read().decode('utf-8-sig', errors='ignore'), newline=None)
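To make the difference between the two handlers concrete, here is a small sketch on made-up bytes (the sample data is an assumption, not the poster's file). It also shows that a NUL byte is valid UTF-8, so neither handler removes it.

```python
# Sample bytes containing 0xa0, which is not a valid UTF-8 start byte.
raw = b"name,city\nAnna,K\xa0ln\n"

# 'ignore' silently drops the undecodable byte.
ignored = raw.decode("utf-8-sig", errors="ignore")

# 'replace' substitutes U+FFFD (the replacement character) for it.
replaced = raw.decode("utf-8-sig", errors="replace")

# A NUL byte decodes without error, so the errors argument never touches it.
nul_text = b"a\x00b".decode("utf-8-sig", errors="ignore")
```

So errors='ignore' should get rid of the UnicodeDecodeError, but a NUL character in the data will still reach the csv reader.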
I am still getting the same error:
_csv.Error: line contains NULL byte

The main issue is in the line below:
for line in reader:
The data behind the reader object contains null bytes, and the error is thrown here.
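One possible way around this is to remove the NUL characters from the decoded text before csv.DictReader sees it. The following is a minimal sketch, assuming the whole upload fits in memory; the helper name dict_reader_without_nuls is hypothetical, and raw_bytes stands in for csv_file.stream.read() from the original post.

```python
import csv
import io


def dict_reader_without_nuls(raw_bytes, encoding="utf-8-sig"):
    """Decode raw_bytes, drop NUL characters, and return a csv.DictReader."""
    # errors='replace' avoids UnicodeDecodeError on bytes such as 0xa0.
    text = raw_bytes.decode(encoding, errors="replace")
    # NUL is a valid character after decoding, so strip it explicitly.
    text = text.replace("\x00", "")
    # newline='' leaves line endings untouched so the csv module handles them.
    return csv.DictReader(io.StringIO(text, newline=""))


# Made-up sample with a stray NUL byte inside a field.
rows = list(dict_reader_without_nuls(b"name,age\nAnn\x00a,30\n"))
```

Whether dropping the NULs is safe depends on why they are in the file. If they come from a different encoding such as UTF-16, decoding with the right codec would be the better fix.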
(May-15-2019, 05:08 AM) eshwinsukhdeve wrote: stream = io.StringIO(csv_file.stream.read().decode('utf-8-sig'), newline=None)  # error is here
(May-15-2019, 05:08 AM) eshwinsukhdeve wrote: While executing the above code I got the below error. UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 252862: invalid start byte

My suggestion should fix this error, which is the one produced by the code quoted above.