May-16-2018, 03:09 PM
(May-16-2018, 01:46 PM)garikhgh0 Wrote: If I found such errors in my file, how to handle them?
What are you using, and how do you read the file?
\xa0 is a non-breaking space in Latin-1 (ISO 8859-1). A lone 0xA0 byte is not valid UTF-8, which is why decoding fails. To read it as UTF-8 you can just ignore the error, as posted before.
>>> s = b'"2016-02-01","Htech 6605 Wired\xa0 selfie HV-BTM 11","Accessories","Dalma RSC","1","Sales"\r\n'
>>> print(s)
b'"2016-02-01","Htech 6605 Wired\xa0 selfie HV-BTM 11","Accessories","Dalma RSC","1","Sales"\r\n'
>>> type(s)
<class 'bytes'>
>>> print(s.decode('utf-8', errors='ignore'))
"2016-02-01","Htech 6605 Wired selfie HV-BTM 11","Accessories","Dalma RSC","1","Sales"
>>> # latin-1 will work, and it keeps the non-breaking space
>>> print(s.decode('latin-1'))
"2016-02-01","Htech 6605 Wired  selfie HV-BTM 11","Accessories","Dalma RSC","1","Sales"

You can also replace the byte first and then decode as UTF-8.
>>> s = b'"2016-02-01","Htech 6605 Wired\xa0 selfie HV-BTM 11","Accessories","Dalma RSC","1","Sales"\r\n'
>>> new_string = s.replace(b'\xa0', b'')
>>> print(new_string.decode('utf-8'))
"2016-02-01","Htech 6605 Wired selfie HV-BTM 11","Accessories","Dalma RSC","1","Sales"
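If you don't know the encoding up front, one option is to try UTF-8 first and fall back to Latin-1 only when that fails. Here is a minimal sketch of that idea, not a definitive recipe. The helper name decode_with_fallback is made up for illustration, and the sample bytes stand in for whatever you read from your file in binary mode.

```python
import csv
import io

# Sample row from this thread; in a real script these bytes would come
# from reading the file in binary mode.
raw = b'"2016-02-01","Htech 6605 Wired\xa0 selfie HV-BTM 11","Accessories","Dalma RSC","1","Sales"\r\n'


def decode_with_fallback(data, encodings=("utf-8", "latin-1")):
    """Hypothetical helper: return (text, encoding) for the first encoding that works."""
    for encoding in encodings:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the given encodings could decode the data")


text, used = decode_with_fallback(raw)

# Optional: turn the non-breaking space into an ordinary space.
text = text.replace("\xa0", " ")

# newline="" leaves the \r\n untouched so the csv module can handle it.
rows = list(csv.reader(io.StringIO(text, newline="")))

print(used)
print(rows[0])
```

Latin-1 maps every possible byte to a character, so it never raises, but that also means it cannot tell you when it is the wrong guess. If you already know the file is Latin-1, it is simpler to open it with open(path, encoding='latin-1', newline='') and pass the file object straight to csv.reader.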