Jan-17-2018, 11:40 AM
(This post was last modified: Jan-17-2018, 11:42 AM by Gribouillis.)
If there are only 40,000 records, I'm not sure it can be called a big file. Did you try loading the whole file into memory to see whether Python can manipulate the data as a whole? Also note that you may only need to load a few columns for the main work of sorting the lines.
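For example, here is a minimal sketch with only the standard library (the file name 'records.csv' and the column 'price' are just placeholders for your own data):

import csv

# Load every record into memory; 40000 rows of a few columns fit easily in RAM.
with open('records.csv', newline='') as f:
    reader = csv.DictReader(f)
    rows = list(reader)

# Sort on a single column (assumed numeric here).
rows.sort(key=lambda row: float(row['price']))

# Write the sorted data back out.
with open('sorted.csv', 'w', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=reader.fieldnames)
    writer.writeheader()
    writer.writerows(rows)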
Nowadays, everybody seems to be using the pandas library to handle tabular data. I don't know this library, but it is probably the first thing you could check: try to load your data with pandas.
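Something along these lines should be enough to test it (the file and column names are assumptions, not something I have run on your data):

import pandas as pd

# read_csv loads the whole table at once; usecols restricts the load
# to the columns actually needed for sorting.
df = pd.read_csv('records.csv', usecols=['id', 'price'])
df = df.sort_values('price')
df.to_csv('sorted.csv', index=False)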
There is also a classic and very mature library named pytables that can manage the storage of very large amounts of data. It can also be a much more comfortable alternative to an SQL database. It may have nice sorting capabilities too, as in the sketch below.
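A rough sketch of what that could look like (the table layout and column names are invented for illustration, so check the pytables documentation before relying on it):

import tables

class Record(tables.IsDescription):
    name = tables.StringCol(32)
    price = tables.Float64Col()

with tables.open_file('records.h5', mode='w') as h5:
    table = h5.create_table('/', 'records', Record)
    row = table.row
    for name, price in [('a', 3.0), ('b', 1.0), ('c', 2.0)]:
        row['name'], row['price'] = name, price
        row.append()
    table.flush()
    # A completely sorted index on the column is needed before sorted iteration.
    table.cols.price.create_csindex()
    for r in table.itersorted('price'):
        print(r['name'].decode(), r['price'])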