Jul-30-2020, 04:02 PM
I am querying a SQL table with about 3.5 million rows and 400 columns; I estimate it is less than 5 GB in size.
If I use pd.read_sql to run a SELECT * on this table, it takes many, many hours to execute (over 10 hours sometimes).
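For reference, here is roughly what I'm running (the connection string below is a placeholder, not my actual setup):

import pandas as pd
from sqlalchemy import create_engine

# Placeholder connection string -- actual dialect, server, and credentials omitted
engine = create_engine("dialect+driver://user:pass@host/dbname")

# This single call is the one that takes many hours (sometimes 10+)
df = pd.read_sql("SELECT * FROM my_table", engine)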
I would like to read in the entire table and not cut it into pieces. I know folks will suggest otherwise, and I get that, but I am still looking for a better solution that can handle reading in the whole table. We are not talking about terabytes here.
Also, even if I use chunks, the total time the individual chunks take to process is about the same as executing the full-table query in one go.
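For example, this is the kind of chunked read I tried (the chunksize value here is just illustrative):

import pandas as pd
# (engine is the same SQLAlchemy engine created above)

# chunksize makes read_sql return an iterator of DataFrames instead of one frame
chunks = pd.read_sql("SELECT * FROM my_table", engine, chunksize=100_000)

# Concatenating the chunks back together -- total wall time ends up roughly
# the same as the single full-table read
df = pd.concat(chunks, ignore_index=True)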
Is there any alternative approach that would work faster?
Thanks,
Brian