Feb-13-2023, 11:21 AM
(Feb-12-2023, 06:42 PM)jefsummers Wrote: I'd give a thought to a Pandas alternative. There are several, and when you run into a pandas limitation (or speed issue) take a look.
Polars - listen to the recent Talk Python To Me Podcast for some details (episode 402)
Vaex - supports up to a billion rows
Dask
PySpark - the Python API for Spark, which is written in Scala; it supports large datasets and distributed computing.
Thanks for the feedback.
I'd already started writing code to load the data into a database and read it back from there, but I will definitely look into your suggestions to see whether they can do what I want in the future.
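
For anyone reading along, here is a minimal sketch of the database approach mentioned above. It assumes SQLite through Python's standard `sqlite3` module. The table name `readings`, its columns, the sample rows, and the helper names `load_rows` and `iter_chunks` are all hypothetical and are not taken from the poster's actual project.

```python
import sqlite3


def load_rows(conn, rows):
    """Create the (hypothetical) readings table and bulk-insert rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings (id INTEGER PRIMARY KEY, value REAL)"
    )
    conn.executemany("INSERT INTO readings (id, value) VALUES (?, ?)", rows)
    conn.commit()


def iter_chunks(conn, chunk_size=1000):
    """Yield lists of at most chunk_size rows so the whole table
    never has to be held in memory at once."""
    cursor = conn.execute("SELECT id, value FROM readings ORDER BY id")
    while True:
        chunk = cursor.fetchmany(chunk_size)
        if not chunk:
            break
        yield chunk


# In-memory database for illustration; a real project would pass a file path.
conn = sqlite3.connect(":memory:")
load_rows(conn, [(i, i * 0.5) for i in range(2500)])

# Aggregate chunk by chunk instead of loading everything up front.
total = sum(value for chunk in iter_chunks(conn) for _, value in chunk)
```

Reading in chunks keeps memory use bounded, which is usually why the data goes into a database in the first place. If pandas is still wanted for the analysis step, `pandas.read_sql_query` accepts a `chunksize` argument that works along the same lines.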