Python Forum
Read/Sort large text file, avoiding line-by-line read, using mmap or HDF5
#1
Hello,

I have a large data file of shape (N, 4) that I am currently copying into a NumPy memmap line by line. My files are around 10 GB each; a simplistic implementation is given below. Although it works, it takes a huge amount of time.

I would like to implement this so that the text file is read directly, without the per-line Python loop, and I can still access the elements. After that, I need to sort the whole (mapped) file by the values in column 2.

The examples I see online assume a smaller piece of data (d) and use f[:] = d[:], but I can't do that because d is huge in my case and eats all my RAM.
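One possible way around this, sketched here under assumptions, is to apply the same slice assignment one block at a time, so that only a single block of parsed rows is in RAM at any moment. The function name text_to_memmap and the chunk_rows parameter are hypothetical; the comma delimiter and float32 dtype come from the code in this post, and the file is assumed to have no header line.

```python
from itertools import islice

import numpy as np


def text_to_memmap(txt_path, dat_path, nrows, ncols, chunk_rows=1_000_000):
    """Copy a comma-separated text file into a float32 memmap, block by block.

    Hypothetical helper: nrows and ncols must be known in advance,
    as in the original post.
    """
    out = np.memmap(dat_path, dtype=np.float32, mode="w+", shape=(nrows, ncols))
    start = 0
    with open(txt_path) as fh:
        while True:
            # Pull at most chunk_rows lines from the file.
            lines = list(islice(fh, chunk_rows))
            if not lines:
                break
            # Let NumPy parse the whole block at once instead of
            # converting every field in a Python-level loop.
            block = np.loadtxt(lines, delimiter=",", dtype=np.float32, ndmin=2)
            stop = start + block.shape[0]
            out[start:stop, :] = block
            start = stop
    out.flush()  # make sure everything is written to disk
    del out
    return start  # number of rows actually written
```

Peak memory is governed by chunk_rows rather than by the file size, so the value can be tuned to the machine. pandas.read_csv with its chunksize argument would be another way to produce the blocks.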

PS: I know how to load the file with np.loadtxt and sort it with argsort, but that approach fails with a memory error for files that are gigabytes in size. I would appreciate any direction.

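import numpy as np

nrows, ncols = 20000000, 4
# Disk-backed array that will hold the parsed values
f = np.memmap('memmapped.dat', dtype=np.float32,
              mode='w+', shape=(nrows, ncols))

filename = "my_file.txt"

with open(filename) as file:
    # Parse one comma-separated line at a time (this is the slow part)
    for i, line in enumerate(file):
        floats = [float(x) for x in line.split(',')]
        f[i, :] = floats

f.flush()  # write pending changes to disk
del f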
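
For the sorting step, a minimal sketch (not a tested solution for 10 GB inputs) is to run argsort on the key column only and then write the reordered rows into a second memmap in blocks. With 20,000,000 rows, one float32 column is about 80 MB and the int64 index array about 160 MB, so both should fit in RAM. The function name sort_memmap_by_column, the output path, and chunk_rows are hypothetical, and "column 2" is assumed to mean zero-based index 1.

```python
import numpy as np


def sort_memmap_by_column(src_path, dst_path, nrows, ncols, col=1,
                          chunk_rows=1_000_000):
    """Write a copy of a float32 memmap with its rows sorted by one column."""
    src = np.memmap(src_path, dtype=np.float32, mode="r", shape=(nrows, ncols))
    dst = np.memmap(dst_path, dtype=np.float32, mode="w+", shape=(nrows, ncols))

    # Only the key column is copied into RAM; the other columns stay on disk.
    keys = np.array(src[:, col])
    order = np.argsort(keys, kind="stable")

    # Gather the rows in sorted order, one block at a time.
    for start in range(0, nrows, chunk_rows):
        idx = order[start:start + chunk_rows]
        dst[start:start + len(idx), :] = src[idx, :]

    dst.flush()
    del src, dst
```

A call such as sort_memmap_by_column('memmapped.dat', 'sorted.dat', nrows, ncols) would leave the sorted rows in a new file. The gather step reads the source rows in effectively random order, so it may be slow on a spinning disk.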