Thread: Loading large .csv file with pandas (/thread-27457.html)
Python Forum (https://python-forum.io), General Coding Help
Loading large .csv file with pandas - hangejj - Jun-07-2020

Hello, I am trying to load a .csv file that has around 3 million lines. The file uses "|" as its delimiter because the data itself contains commas, so a comma could not serve as the delimiter. I'm running into issues getting the data to load so that I can clean it up for use with SQL. The spreadsheet programs I know of can handle only a little over 1 million rows, so I have a few questions. Please note I'm looking for guidance, not for someone to do the work for me.

1. Since I'm using a different delimiter than the file type suggests, would it be better to save the file as a .txt file?

2. With both of the code examples below, I'm getting a tokenizing error.

    import pandas as pd
    csv="/home/file.csv"
    c_size = 500
    for chunk in pd.read_csv(csv,chunksize=c_size):
        print(chunk)

Quote: ParserError: Error tokenizing data. C error: Expected 3 fields in line 94909, saw 4

    import pandas as pd
    csv="/home/joe/study.csv"
    c_size = 500
    for chunk in pd.read_csv(csv,chunksize=c_size):
        print(chunk.shape)

Quote: ParserError: Error tokenizing data. C error: Expected 3 fields in line 94909, saw 4

RE: Loading large .csv file with pandas - ndc85430 - Jun-07-2020

(Jun-07-2020, 03:56 PM)hangejj Wrote: 1. Since I'm using a different delimiter than the file type, would it be better to save the file as a .txt file?

No. At least on Unix, file extensions aren't particularly meaningful. The read_csv function has a parameter that lets you specify the delimiter. See the read_csv docs.
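For anyone landing here later, the delimiter parameter mentioned in the reply above is sep (read_csv also accepts delimiter as an alias). What follows is only a minimal sketch, not the original poster's eventual solution. The sample data, the column names, and the load_in_chunks helper are all made up for illustration, and an in-memory buffer stands in for the real file. The tokenizing error is probably what you would expect when read_csv splits on commas by default and some rows contain more commas than the first row, but that is an assumption about the poster's data.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real file: pipe-delimited,
# with commas inside the data itself.
sample = io.StringIO(
    "id|name|note\n"
    "1|Smith, John|first, visit\n"
    "2|Doe, Jane|follow-up\n"
    "3|Roe, Sam|no, show\n"
)


def load_in_chunks(source, chunk_size=2):
    """Read a pipe-delimited source chunk by chunk and return one DataFrame."""
    chunks = []
    # sep="|" tells the parser to split fields on pipes instead of commas,
    # so commas inside a field no longer change the field count.
    for chunk in pd.read_csv(source, sep="|", chunksize=chunk_size):
        chunks.append(chunk)
    return pd.concat(chunks, ignore_index=True)


df = load_in_chunks(sample)
print(df.shape)
```

For a real file, a path can be passed in place of the buffer and chunk_size raised to something like the 500 used above. Concatenating every chunk only makes sense if the whole file fits in memory; otherwise each chunk would be cleaned and written out inside the loop.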
RE: Loading large .csv file with pandas - hangejj - Jun-08-2020

(Jun-07-2020, 04:08 PM)ndc85430 Wrote: (Jun-07-2020, 03:56 PM)hangejj Wrote: 1. Since I'm using a different delimiter than the file type, would it be better to save the file as a .txt file?

Thank you. Once I solve this I'll put the solution in case anyone else comes across this.