Python Forum
Iterating Large Files
#2
Before turning to memory mapping, which may not change much because the bottleneck does not seem to be the time spent accessing the files, I think the algorithm itself could be improved. Sort both arrays of numbers first, then use 3 pointers:
  • The first pointer is the current element x[i] in the first array of numbers.
  • The second pointer is the first element y[j] in the second array such that y[j] >= x[i] - 4000.
  • The third pointer is the first element y[k] in the second array such that y[k] > x[i] + 4000.
For each i, the differences x[i] - y[n] must be appended to the result for every n in range(j, k). These are exactly the elements of the second array that lie within 4000 of x[i].

Because both arrays are sorted, j and k never decrease as i increases. Their new values can be found by a binary search in the second array that starts from their previous positions.
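
Here is a minimal sketch of the idea using the standard bisect module. The function name windowed_differences and the width parameter are my own, chosen for illustration; I also assume the numbers are already loaded into plain Python sequences, which may not match how your files are read.

```python
from bisect import bisect_left, bisect_right


def windowed_differences(xs, ys, width=4000):
    """Return every x - y with x - width <= y <= x + width."""
    xs = sorted(xs)
    ys = sorted(ys)
    result = []
    j = k = 0
    for x in xs:
        # j: first index with ys[j] >= x - width.
        # The search starts at the previous j because j never decreases.
        j = bisect_left(ys, x - width, j)
        # k: first index with ys[k] > x + width, searched from the previous k.
        k = bisect_right(ys, x + width, k)
        # Only the elements ys[j:k] are close enough to x.
        result.extend(x - ys[n] for n in range(j, k))
    return result


print(windowed_differences([10, 5000, 20000], [0, 4010, 9001, 30000]))
# prints [10, -4000, 990]
```

Passing the previous j and k as the lower bound of each search keeps every binary search confined to the part of the second array that has not been passed yet.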

Depending on your data, these algorithmic changes could dramatically cut both the number of Python loop iterations and the number of numerical tests the program performs.
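
To get a feel for the possible saving, the sketch below compares the number of pairs a nested loop would test with the number of pairs that fall inside the +/- 4000 window. The helper count_pair_tests and the evenly spaced test data are made up for illustration; your real data may be distributed very differently.

```python
from bisect import bisect_left, bisect_right


def count_pair_tests(xs, ys, width=4000):
    """Return (pairs tested by a nested loop, pairs inside the window)."""
    ys = sorted(ys)
    brute = len(xs) * len(ys)
    windowed = sum(
        bisect_right(ys, x + width) - bisect_left(ys, x - width)
        for x in xs
    )
    return brute, windowed


# Synthetic data: 1000 values in each array, spaced 1000 apart.
xs = range(0, 1_000_000, 1000)
ys = range(500, 1_000_000, 1000)
brute, windowed = count_pair_tests(xs, ys)
print(brute, windowed)
```

With this synthetic data each x has at most 8 neighbours inside the window, so the windowed count is a small fraction of the million pairs a nested loop would test. If most of your values lie within 4000 of each other, the gain will be much smaller.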
Messages In This Thread
Iterating Large Files - by Robotguy - Jun-25-2020, 10:46 PM
RE: Iterating Large Files - by Gribouillis - Jun-26-2020, 10:00 AM
RE: Iterating Large Files - by Robotguy - Jul-15-2020, 08:54 PM
RE: Iterating Large Files - by Gribouillis - Jul-15-2020, 11:01 PM
RE: Iterating Large Files - by Robotguy - Jul-17-2020, 04:23 PM
RE: Iterating Large Files - by Gribouillis - Jul-16-2020, 07:11 AM
RE: Iterating Large Files - by Gribouillis - Jul-17-2020, 07:41 PM
RE: Iterating Large Files - by Robotguy - Jul-22-2020, 03:23 PM
RE: Iterating Large Files - by Gribouillis - Jul-22-2020, 06:09 PM
RE: Iterating Large Files - by Robotguy - Jul-22-2020, 08:46 PM
RE: Iterating Large Files - by Gribouillis - Jul-22-2020, 09:13 PM
