Good morning,
I have read the posting rules, but I cannot attach a sample of my data or copy the entire error message, because the data I am working on is located on a server without internet access. I apologise for the inconvenience. I will try to reproduce as much of the requested information as I can below.
I am working with very large SAS files (data on each job, hence millions of rows) and got a memory error when I simply tried to read them (strangely, they open fine in R or Stata). I therefore searched and found the chunksize option of pandas.read_sas, which reads the data in chunks. My code is now the following:
import pandas as pd

chunk_list = []
df_chunk = pd.read_sas(r'file.sas7bdat', chunksize=500)
for chunk in df_chunk:
    chunk_list.append(chunk)
At this point I get the following error (I am reproducing it here manually as I cannot copy and paste):
Error:line 660, in _chunk_to_dataframe
if self.column_formats[j] in const.sas_date_formats:
IndexError: list index out of range
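As far as I understand, this IndexError means that the index j went past the end of the list self.column_formats. Below is a toy sketch in plain Python (not the pandas source) of how a line of this shape can fail. It assumes the list of formats ends up shorter than the number of columns, which I have not been able to verify for my file; all names and values are made up for illustration.

```python
# Toy reproduction in plain Python; this is NOT the pandas source.
# All values below are hypothetical and only serve the illustration.
sas_date_formats = ("DATE", "DDMMYY")          # made-up subset of date formats
column_names = ["id", "start_date", "wage"]    # three columns
column_formats = ["", "DATE"]                  # only two formats (assumed mismatch)


def find_date_columns(names, formats):
    """Return the names of the columns whose format is a date format."""
    date_cols = []
    for j in range(len(names)):
        # Same shape as the failing line: index the formats by column position.
        if formats[j] in sas_date_formats:
            date_cols.append(names[j])
    return date_cols


try:
    find_date_columns(column_names, column_formats)
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(error_message)  # list index out of range
```

If this assumption holds, the error would depend on how the file's column metadata is read rather than on the chunk size.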
Looking deeper into the error message, the issue seems to be in the underlying function "_chunk_to_dataframe(self)", at the following line:
if self.column_formats[j] in const.sas_date_formats:
Many thanks for your help,
Axelle