 Is there a way to save a CSV file as a python object
#1
Hi all,

I was wondering if there is a way to save a CSV file as a Python object so that it can be read into Python more quickly. The reason I am concerned about this is that some of the CSV files I hold are large: tens to hundreds of columns of features and many rows, ~30,000.

Now, having the CSV file itself is not an issue, but it's bulky and takes some time to read into Python.

In R, these can be saved as an RDS or R object, which, although it holds the same CSV data, loads into R almost instantly. Reading the CSV file itself into R takes a good 20-30 seconds.

Is there an equivalent for this in Python? I will be reading a lot of CSV files into Python, so having an object holding the CSV data would be useful if it means reading it in is much faster.

Any advice would be appreciated. Thank you!
Amir
#2
The question is too ambiguous to answer: "it's bulky and takes some time to read into Python".

What specifically is bulky, and what time performance would be considered satisfactory? And the main question: into what data structure do you read the file, and what will you do with the acquired data?
I'm not 'in'-sane. Indeed, I am so far 'out' of sane that you appear a tiny blip on the distant coast of sanity. Bucky Katt, Get Fuzzy

Life of Brian: Conjugate the verb, "to go" !
#3
(Jul-16-2019, 11:12 AM)perfringo Wrote: The question is too ambiguous to answer: "it's bulky and takes some time to read into Python". What specifically is bulky, and what time performance would be considered satisfactory? And the main question: into what data structure do you read the file, and what will you do with the acquired data?

The file size grows to the point where it takes time to read into Python. Based on experience in R, reading an R object into R is near instant, and that is what I would consider satisfactory.

The object would hold the table, and the downstream processes would be manipulation of the data for later analysis of the dataframe. A pandas DataFrame is the ultimate goal, which, again, is absolutely fine when reading in the CSV file itself. I was just wondering if there is a known Python-style object that holds the CSV file...
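For a pandas DataFrame specifically, the closest analogue to R's saveRDS()/readRDS() is DataFrame.to_pickle() / pd.read_pickle(). A minimal sketch (the inline DataFrame is stand-in data; in practice it would come from pd.read_csv on the real file):

```python
import pandas as pd

# Stand-in for a counts table that would normally come from pd.read_csv(...)
df = pd.DataFrame({"gene": ["A", "B", "C"], "count": [10, 20, 30]})

# Save the DataFrame as a binary pickle file (roughly R's saveRDS equivalent)
df.to_pickle("counts.pkl")

# Reload it in a later session without re-parsing any CSV text
df2 = pd.read_pickle("counts.pkl")
print(df.equals(df2))  # True
```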
#4
(Jul-16-2019, 11:23 AM)amjass12 Wrote: Based on experience in R, reading an R object into R is near instant, and that is what I would consider satisfactory.

I read CSV files with 200K rows into a dictionary and it's 'near instant'. Access to dictionary keys is instant (O(1)).
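A sketch of that dict-based approach, using csv.DictReader and keying on the first column (the inline sample data is hypothetical; a real file would be opened with open(path, newline="")):

```python
import csv
import io

# Hypothetical sample data standing in for a real CSV file on disk
data = io.StringIO("id,value\nrow1,10\nrow2,20\n")

# Build a dict keyed on the "id" column for O(1) lookups afterwards
table = {row["id"]: row for row in csv.DictReader(data)}

print(table["row2"]["value"])  # prints 20 (note: csv gives strings, not ints)
```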
#5
(Jul-16-2019, 11:48 AM)perfringo Wrote:
(Jul-16-2019, 11:23 AM)amjass12 Wrote: Based on experience in R, reading an R object into R is near instant, and that is what I would consider satisfactory.
I read CSV files with 200K rows into a dictionary and it's 'near instant'. Access to dictionary keys is instant (O(1)).

OK, perfect! This was the kind of information I was looking for! If this is the quickest way to do it in Python, then great. Indeed, it is significantly faster in Python than in R, but I was thinking about the long term, where I will be getting data sets of a similar size, ~200k rows. They are essentially counts tables... in R these take a long time to read in!

Thanks for the info! I shall continue reading in CSV files...
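As a rough way to check whether a binary save pays off for tables of this shape, a hypothetical benchmark (the synthetic data, file names, and sizes are made up for illustration):

```python
import time
import numpy as np
import pandas as pd

# Synthetic counts-style table: 30,000 rows x 100 integer columns
df = pd.DataFrame(np.random.randint(0, 1000, size=(30_000, 100)))
df.to_csv("counts.csv", index=False)
df.to_pickle("counts.pkl")

# Time a full CSV parse
start = time.perf_counter()
pd.read_csv("counts.csv")
csv_time = time.perf_counter() - start

# Time loading the same table from pickle
start = time.perf_counter()
pd.read_pickle("counts.pkl")
pkl_time = time.perf_counter() - start

# Pickle skips text parsing entirely, so it typically loads much faster
print(f"read_csv: {csv_time:.3f}s  read_pickle: {pkl_time:.3f}s")
```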
