Python Forum
Read/sort a large text file without line-by-line reads, using mmap or HDF5

I have a large data file of shape (N, 4) that I am copying into a memory-mapped array line by line. The files are about 10 GB each; a simplistic implementation is given below. It works, but it takes a huge amount of time.

I would like to implement this so that the text file is read directly and I can access its elements. After that, I need to sort the whole (mapped) file by the values in column 2.

The examples I see online assume a smaller array (d) and use f[:] = d[:], but I can't do that since d is huge in my case and eats my RAM.

PS: I know how to load the file using np.loadtxt and sort it using argsort, but that approach fails with a memory error for files of several GB. I would appreciate any direction.
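For the sorting step, one possible direction is to run argsort on the key column only, rather than on the whole array. The sketch below assumes the data already sits in a float32 memmap file of known shape, and that "column 2" means index 1 (zero-based). The function name sort_memmap_by_column, the destination file, and the chunk size are hypothetical and chosen for illustration. For 20,000,000 rows the key column takes about 80 MB and the int64 index array about 160 MB, so only those two need to fit in RAM.

```python
import numpy as np

def sort_memmap_by_column(src_path, dst_path, nrows, ncols,
                          col=1, chunk_rows=1_000_000):
    """Write a copy of a float32 memmap sorted by one column (hypothetical helper)."""
    src = np.memmap(src_path, dtype=np.float32, mode="r", shape=(nrows, ncols))
    dst = np.memmap(dst_path, dtype=np.float32, mode="w+", shape=(nrows, ncols))

    # Copy only the key column into RAM (nrows * 4 bytes).
    key = np.array(src[:, col])
    # Row order that sorts the key column; stable keeps ties in file order.
    order = np.argsort(key, kind="stable")

    # Gather the rows in sorted order, one block at a time.
    for start in range(0, nrows, chunk_rows):
        idx = order[start:start + chunk_rows]
        dst[start:start + len(idx)] = src[idx]

    dst.flush()
    return dst
```

The gather step reads rows from scattered positions in the source file, so it may be slow on spinning disks; a larger chunk_rows trades memory for fewer passes.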

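import numpy as np

nrows, ncols = 20000000, 4
# Disk-backed array that will hold the parsed values
f = np.memmap('memmapped.dat', dtype=np.float32,
              mode='w+', shape=(nrows, ncols))

filename = "my_file.txt"

with open(filename) as file:
    # Parse each comma-separated line and copy it into row i
    for i, line in enumerate(file):
        floats = [float(x) for x in line.split(',')]
        f[i, :] = floats

f.flush()  # write pending changes to disk
del f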
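A chunked variant of the loop above might look like the following. It is a minimal sketch, not a benchmarked solution: it reads a block of lines with itertools.islice, parses the block in one np.loadtxt call, and writes it to the memmap as a single slice assignment. The function name load_csv_to_memmap and the chunk_rows default are assumptions made for illustration; the speedup, if any, depends on the NumPy version and the disk.

```python
import itertools
import numpy as np

def load_csv_to_memmap(txt_path, dat_path, nrows, ncols, chunk_rows=1_000_000):
    """Fill a float32 memmap from a comma-separated file, block by block."""
    out = np.memmap(dat_path, dtype=np.float32, mode="w+", shape=(nrows, ncols))
    start = 0
    with open(txt_path) as fh:
        while start < nrows:
            # Take at most chunk_rows lines; the rest of the file stays unread.
            lines = list(itertools.islice(fh, min(chunk_rows, nrows - start)))
            if not lines:
                break  # the file had fewer rows than expected
            # Parse the whole block at once; ndmin=2 keeps a 2-D shape
            # even when the block holds a single row.
            block = np.loadtxt(lines, delimiter=",", dtype=np.float32, ndmin=2)
            out[start:start + len(block)] = block
            start += len(block)
    out.flush()
    return start  # number of rows actually written
```

Peak memory is bounded by one block of text plus its parsed array, so chunk_rows can be tuned to the available RAM.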
