Python Forum
Millions of rows and 40 columns
#1
Hello, I have a question.
I use Python for analytics, but sometimes I have to handle data with millions of rows and 40 columns. When I load the data, my computer grinds to a halt. What solutions are available, and where can I find more information?


thanks in advance
#2
The best way is to load the data one line at a time and save it into a reliable database (personally, I'd use PostgreSQL).
This may only work if it's not time critical.
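A minimal sketch of the line-at-a-time loading idea. It uses sqlite3 from the standard library as a stand-in so it runs anywhere; for PostgreSQL you'd swap in psycopg2, and a bulk COPY would be much faster in practice. The function name and table name are mine:

```python
import csv
import sqlite3

def load_csv_to_db(csv_path, db_path, table="records"):
    """Load a CSV into a database one row at a time (constant memory)."""
    with open(csv_path, newline="") as f, sqlite3.connect(db_path) as conn:
        reader = csv.reader(f)
        header = next(reader)
        cols = ", ".join(f'"{c}"' for c in header)
        placeholders = ", ".join("?" for _ in header)
        conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({cols})")
        insert = f"INSERT INTO {table} VALUES ({placeholders})"
        for row in reader:  # one row at a time, never the whole file in RAM
            conn.execute(insert, row)
        conn.commit()
```

Once it's in the database, you query only the rows and columns you actually need instead of holding everything in memory.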
If it is, you need to read the file incrementally, line by line, and build
a hashed index (or SQL index) of:
hash code, key, and file location (which you can get with filepointer.tell()).
Then, to access a specific record, look it up (by hash code) in the index and seek to its location in the file.
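The index idea above, sketched in a few lines. A Python dict supplies the hashing the post mentions; `build_index`, `fetch_record`, and the assumption that the key is the first CSV column are mine:

```python
import csv

def build_index(path, key_col=0):
    """Map each record's key to the byte offset of its line in the file."""
    index = {}
    with open(path, newline="") as f:
        f.readline()  # skip the header row
        while True:
            pos = f.tell()       # offset of the line we're about to read
            line = f.readline()
            if not line:
                break
            key = line.split(",")[key_col]
            index[key] = pos     # dict = hashed index of key -> location
    return index

def fetch_record(path, index, key):
    """Seek straight to one record without scanning the whole file."""
    with open(path, newline="") as f:
        f.seek(index[key])
        return f.readline().rstrip("\n").split(",")
```

The index holds only keys and offsets, so it stays small even when the data file itself won't fit in memory.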
#3
Files with billions of rows are more fun.
Tradition is peer pressure from dead people

What do you call someone who speaks three languages? Trilingual. Two languages? Bilingual. One language? American.
#4
They are (were? I've been away from it for quite a while now) common in telecommunications call record processing.
#5
I've seen some data hit over a trillion records, and this was back when you needed a few RAID arrays to hold it (that place had 60+ of them).

