Pandas CSV Issues - Printable Version

Pandas CSV Issues - rmfooty - Nov-14-2022

Hi all, and thank you for your help in advance.

I am experiencing a strange issue with Pandas. I have a program that reads an entire CSV into a dataframe, changes the contents of the dataframe, and writes the whole CSV back to disk. Unfortunately, once the CSV or dataframe reaches 365 records, it either reads only the first 365 records or writes only the first 365 records.

Has anyone else had an issue with this?

INSTALLED VERSIONS
------------------
commit : None
python : 3.8.10.final.0
python-bits : 64
OS : Linux
OS-release : 5.13.0-1022-aws
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : en_US.UTF-8

pandas : 0.25.3
numpy : 1.17.4
pytz : 2021.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 52.0.0
Cython : None
pytest : 4.6.9
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 3.0.2
lxml.etree : 4.5.0
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.3
IPython : None
pandas_datareader: None
bs4 : 4.9.3
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : 4.5.0
matplotlib : 3.1.2
numexpr : 2.7.1
odfpy : None
openpyxl : 3.0.3
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : None
scipy : 1.3.3
sqlalchemy : None
tables : 3.6.1
xarray : None
xlrd : 2.0.1
xlwt : 1.3.0
xlsxwriter : 3.0.2

RE: Pandas CSV Issues - deanhystad - Nov-14-2022

Quote: Unfortunately when the CSV or dataframe reaches 365 records, it either reads the first 365 records or write the first 365 records

I assume you mean when the dataframe is larger than 365 records. Correct?

I think the problem is elsewhere. 365 rows (records?) is pretty small for pandas.

This program creates a dataframe with 1000 rows, each row having 1000 columns. It writes just fine to a csv file, and when I read it back in, the new dataframe equals the old.

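import pandas as pd

# Build a 1000 x 1000 dataframe: columns "0".."999", each holding 0..999
data = {str(i): list(range(1000)) for i in range(1000)}
df = pd.DataFrame(data)
df.to_csv("test.csv", index=False)

# Read the file back and confirm nothing was lost in the round trip
df2 = pd.read_csv("test.csv")
print(df2.equals(df))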

Output:
True

What have you done to try and debug this problem? 365 is a really interesting number. I can think of all kinds of code I could write that would stop working at 365, but the fault would be mine, not pandas. Can you post the code that is writing your csv file?
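One way to narrow this down is to check whether the rows go missing on disk or in memory. The sketch below counts the records in the file with the standard csv module and compares that to the length of the dataframe pandas reads. The helper names count_csv_records and check_round_trip are made up for illustration, and the sketch assumes the file has a single header row.

```python
import csv

import pandas as pd


def count_csv_records(path):
    """Count data records on disk (header excluded) with the csv module."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row (assumed to be present)
        # Ignore blank lines, which pandas.read_csv also skips by default
        return sum(1 for row in reader if row)


def check_round_trip(path):
    """Return (records on disk, rows pandas reads) for the same file."""
    on_disk = count_csv_records(path)
    in_frame = len(pd.read_csv(path))
    return on_disk, in_frame
```

If the two numbers match but are smaller than expected, the file was already short when it was read, which points at whatever wrote it (or at something else touching the file) rather than at read_csv.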

RE: Pandas CSV Issues - rmfooty - Nov-16-2022

Hi Dean.

Thank you so much for your help. I created a load of test records and found that the issue arose from having two programs accessing the CSV.

Thanks again.
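For anyone who lands here with the same symptom: the thread does not say how the two programs collided, so the following is only a sketch under the assumption that one program was reading the CSV while the other was still part way through writing it. The idea is to write to a temporary file in the same directory and then swap it into place with os.replace, so a reader sees either the old file or the new one, never a partial one. The function name write_csv_atomically is hypothetical.

```python
import os
import tempfile


def write_csv_atomically(df, path):
    """Write df to path via a temp file so readers never see a partial CSV."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so that os.replace stays
    # on one filesystem. Note that mkstemp creates it with owner-only
    # permissions, which the final file inherits.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    os.close(fd)
    try:
        df.to_csv(tmp_path, index=False)
        os.replace(tmp_path, path)  # atomic rename on POSIX systems
    except BaseException:
        # Clean up the temp file if the write or the rename failed
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Usage would be write_csv_atomically(df, "data.csv") in place of df.to_csv("data.csv", index=False). This does not stop two programs that both read, modify, and write the file from overwriting each other's changes. That case needs some form of locking, or a single program that owns the file.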