Python Forum
Problem with importing a CSV file
#7
A UnicodeDecodeError occurs when the source file can't be decoded as utf8, which is the default encoding of pd.read_csv.
In the pandas version documented below, pd.read_csv does not seem to have a kwarg to ignore encoding errors (pandas 1.3 and later add encoding_errors for this).
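
To make the error concrete, here is a small sketch using only the standard library. The byte string is made up for illustration (it is not taken from the file in question) and assumes the text was written as latin1:

```python
# "café" as latin1 bytes: the é is the single byte 0xE9 (illustrative data)
raw = b"caf\xe9"

try:
    raw.decode("utf-8")  # 0xE9 announces a multi-byte sequence that never arrives
    strict_result = "decoded"
except UnicodeDecodeError as exc:
    strict_result = "failed: " + exc.reason

ignored = raw.decode("utf-8", errors="ignore")  # the undecodable byte is dropped
correct = raw.decode("latin1")                  # the encoding the bytes were written in

print(strict_result)
print(ignored)
print(correct)
```

Note that ignoring errors does not repair anything: the affected character is simply lost.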

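One way could be to open the file in text mode yourself and pass the file object to pandas.
import pandas as pd
# errors='ignore' drops bytes that open() cannot decode; pandas >= 1.3 spells error_bad_lines=False as on_bad_lines='skip'
with open("G:\\Analyser\\2019 OS\\test.csv", errors='ignore') as fd:
    data = pd.read_csv(fd, header=None, error_bad_lines=False)
Take a look at the documentation for pd.read_csv.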

This is the function signature:
pandas.read_csv(filepath_or_buffer, sep=',', delimiter=None, header='infer', names=None, index_col=None, usecols=None, squeeze=False, prefix=None, mangle_dupe_cols=True, dtype=None, engine=None, converters=None, true_values=None, false_values=None, skipinitialspace=False, skiprows=None, nrows=None, na_values=None, keep_default_na=True, na_filter=True, verbose=False, skip_blank_lines=True, parse_dates=False, infer_datetime_format=False, keep_date_col=False, date_parser=None, dayfirst=False, iterator=False, chunksize=None, compression='infer', thousands=None, decimal=b'.', lineterminator=None, quotechar='"', quoting=0, escapechar=None, comment=None, encoding=None, dialect=None, tupleize_cols=None, error_bad_lines=True, warn_bad_lines=True, skipfooter=0, doublequote=True, delim_whitespace=False, low_memory=True, memory_map=False, float_precision=None)
The first argument filepath_or_buffer is described as:
Quote: filepath_or_buffer : str, pathlib.Path, py._path.local.LocalPath or any object with a read() method (such as a file handle or StringIO)

The string could be a URL. Valid URL schemes include http, ftp, s3, and file. For file URLs, a host is expected. For instance, a local file could be file://localhost/path/to/table.csv

I haven't tested the open() example above, but it should work. In this case bytes that can't be decoded are ignored (dropped).
I guess your file is in an encoding other than utf8.
It could be:
  • latin1 (ISO/IEC 8859-1)
  • latin9 (ISO/IEC 8859-15)
  • Windows-1252 (CP 1252 / (Western European) / ANSI)
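
These three encodings agree on most characters, which is why a wrong guess often goes unnoticed at first. A small sketch of where they differ (the two bytes are picked by me for illustration):

```python
# The same byte means different things depending on the encoding.
byte_a4 = b"\xa4"
byte_80 = b"\x80"

a4_latin1 = byte_a4.decode("latin1")      # currency sign
a4_latin9 = byte_a4.decode("iso8859-15")  # euro sign (latin9 replaced the currency sign)
a4_cp1252 = byte_a4.decode("cp1252")      # currency sign, same as latin1

euro_cp1252 = byte_80.decode("cp1252")    # Windows-1252 puts the euro sign at 0x80
ctrl_latin1 = byte_80.decode("latin1")    # in latin1, 0x80 is an invisible control character

print(a4_latin1, a4_latin9, a4_cp1252, euro_cp1252, repr(ctrl_latin1))
```

So if euro signs or typographic quotes come out wrong after reading the file, trying one of the other two encodings is a reasonable next step.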

There is also a module called ftfy, which can repair text that was decoded with the wrong encoding.

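import ftfy

# Read the text, dropping bytes that can't be decoded, and let ftfy repair it
with open('file_with_bad_encoding.txt', errors='ignore') as src:
    fixed_text = ftfy.fix_text(src.read())
# Write the result explicitly as utf8
with open('file_with_fixed_encoding.txt', 'w', encoding='utf-8') as dst:
    dst.write(fixed_text)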
After this, the file is encoded as utf8, and most errors caused by wrong encoding/decoding should be fixed.
Still, it is better to know the correct encoding of the input file in the first place.
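
If nobody can tell you the encoding, one rough way to narrow it down is to try a few candidates strictly and take the first one that decodes without error. This is only a sketch: guess_encoding is a helper name I made up, the candidate list is an assumption, and latin1 accepts every possible byte, so it only works as a last resort and proves nothing by itself.

```python
def guess_encoding(raw, candidates=("utf-8", "cp1252", "latin1")):
    """Return the first candidate encoding that decodes raw without errors."""
    for name in candidates:
        try:
            raw.decode(name)  # strict decoding: raises on any invalid byte
        except UnicodeDecodeError:
            continue
        return name
    return None


# Illustrative byte strings, not taken from a real file
print(guess_encoding("caf\u00e9".encode("utf-8")))
print(guess_encoding(b"caf\xe9"))
print(guess_encoding(b"\x81"))  # 0x81 is undefined in cp1252, so latin1 is the fallback
```

For a real file you would read it in binary mode (open(path, 'rb')) and pass the returned name to the encoding argument of pd.read_csv. Always look at the resulting text afterwards, because a decode that succeeds can still be the wrong one.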


Messages In This Thread
Problem with importing a CSV file - by Chopan2211 - Nov-05-2019, 02:19 PM
RE: Problem with importing a CSV file - by buran - Nov-05-2019, 02:23 PM
RE: Problem with importing a CSV file - by buran - Nov-05-2019, 04:46 PM
RE: Problem with importing a CSV file - by DeaD_EyE - Nov-06-2019, 09:25 AM
