
Please advise: collections for my task
Python 3.7.3

In my task, I need to store in memory a very large number of records (~10,000,000 elements), each of which has on the order of 5-10 fields of different types (bool, string, integer, fixed-point number, date and time). I need to be economical not only with memory but also with processor time (otherwise I won't wait for the processing to complete).

Initially, the data is stored in CSV files, which must be read into the collections (for further processing).

Right now I am thinking of using NumPy structured arrays (in my old C program, I successfully used a vector of structures).
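For concreteness, here is a minimal sketch of what the structured-array approach could look like. The field names, the cents scaling for the fixed-point column, the "1"/"0" encoding of the bool column, the row count, and the file name are all assumptions, not my real schema:

import csv
import numpy as np

# Hypothetical record layout -- adjust field names, widths, and types to the real data.
record_dtype = np.dtype([
    ("flag", np.bool_),              # bool
    ("name", "U32"),                 # fixed-width string (32 characters)
    ("count", np.int32),             # integer
    ("price_cents", np.int64),       # fixed-point number stored as a scaled integer
    ("stamp", "datetime64[s]"),      # date and time
])

def load_csv(path, n_rows):
    """Read a CSV file with a header row into a preallocated structured array."""
    data = np.empty(n_rows, dtype=record_dtype)   # n_rows must be known up front
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        next(reader)                              # skip the header line
        for i, row in enumerate(reader):
            flag, name, count, price, stamp = row
            data[i] = (flag == "1",               # assumes bools stored as "1"/"0"
                       name,
                       int(count),
                       round(float(price) * 100), # e.g. dollars -> cents
                       np.datetime64(stamp))      # assumes ISO-8601 timestamps
    return data

# Example use (file name and row count are assumptions):
# table = load_csv("data.csv", n_rows=10_000_000)
# print(table.dtype, table.nbytes / 2**20, "MiB")

The appeal of this layout is that the whole table sits in one contiguous block whose size is fixed by the dtype (a few tens of bytes per record here), rather than in millions of separate Python objects.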

What would you recommend, and why?
Why don't you try it out? Just a couple of lines of code and you have an initial idea of what it will take. I have experience with smaller files (~200K lines), and the time consumed was so tiny that I had to measure it separately (it was ~0.2 sec). I read the file line by line and built a nested dictionary along the way from the data in the rows, for fast lookup later.
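A minimal sketch of that line-by-line approach, assuming a hypothetical CSV with a header row and "region" and "id" columns used as the lookup keys (the file name and column names are placeholders):

import csv

def build_index(path):
    """Read a CSV line by line and build a nested dict for fast lookups later."""
    index = {}
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)               # takes field names from the header row
        for row in reader:
            # Outer key: region; inner key: record id -> the remaining fields of that row.
            index.setdefault(row["region"], {})[row["id"]] = row
    return index

# Example use:
# index = build_index("data.csv")
# record = index["EU"]["12345"]

Note that a dict of dicts keeps every value as a Python object, so at ~10,000,000 rows it will use considerably more memory than a structured array; timing and measuring both on a slice of the real data is the quickest way to decide.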