Jul-22-2020, 06:09 PM
(This post was last modified: Jul-22-2020, 07:28 PM by Gribouillis.)
Here is a devilish suggestion. I ran the following experiment:
>>> s
array([9.94223926e-01, 7.55188959e-01, 2.87075284e-04, 6.60265593e-01,
       3.12176498e-02, 3.01580980e-01, 9.79960201e-01, 2.37826251e-01,
       1.74042656e-01, 1.39546100e-02, 2.14055048e-01, 8.73880775e-01,
       5.12656017e-01])
>>> filename = 'paillasse/foo.bin'
>>> Path(filename).write_bytes(s.tobytes())
104
>>> subprocess.call('bsort -k 8 -r 8 ' + filename, shell=True)
0
>>> ss = np.frombuffer(Path(filename).read_bytes())
>>> ss
array([2.87075284e-04, 3.12176498e-02, 9.79960201e-01, 2.37826251e-01,  # STILL WRONG, SEE BELOW
       9.94223926e-01, 5.12656017e-01, 6.60265593e-01, 8.73880775e-01,
       3.01580980e-01, 7.55188959e-01, 1.39546100e-02, 2.14055048e-01,
       1.74042656e-01])

In other words, I save the array as bytes in a file, use the external program bsort to perform an in-place binary sort on this file, and then retrieve the sorted numpy array by reading the file back as bytes.
bsort has a reputation for being extremely efficient, and it can presumably handle large files in the same way GNU sort does.
This could be the solution to this problem.
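OOPS, there is still an issue: the array is not sorted yet. It could be a byte order problem, since bsort presumably compares the raw bytes of each key while the floats are written here in the machine's native byte order.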
YES! The following version works on my machine:
from pathlib import Path
import numpy as np
import subprocess

s = np.array([9.94223926e-01, 7.55188959e-01, 2.87075284e-04, 6.60265593e-01,
              3.12176498e-02, 3.01580980e-01, 9.79960201e-01, 2.37826251e-01,
              1.74042656e-01, 1.39546100e-02, 2.14055048e-01, 8.73880775e-01,
              5.12656017e-01])
filename = 'paillasse/foo.bin'
# Swap the byte order of every float before writing the raw bytes to the file
Path(filename).write_bytes(s.byteswap().tobytes())
# Sort the file in place with the external program bsort
subprocess.call(['bsort', '-k', '8', '-r', '8', filename])
# Read the bytes back and swap them again to recover native floats
ss = np.frombuffer(Path(filename).read_bytes()).byteswap()
print(s)
print(ss)
# Compare with numpy's own in-place sort
s.sort()
print('Sorted ?', np.array_equal(s, ss))
Output:
[9.94223926e-01 7.55188959e-01 2.87075284e-04 6.60265593e-01
3.12176498e-02 3.01580980e-01 9.79960201e-01 2.37826251e-01
1.74042656e-01 1.39546100e-02 2.14055048e-01 8.73880775e-01
5.12656017e-01]
[2.87075284e-04 1.39546100e-02 3.12176498e-02 1.74042656e-01
2.14055048e-01 2.37826251e-01 3.01580980e-01 5.12656017e-01
6.60265593e-01 7.55188959e-01 8.73880775e-01 9.79960201e-01
9.94223926e-01]
Sorted ? True
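
To see why the byte swap probably matters, here is a minimal sketch that needs neither bsort nor numpy. It assumes that bsort orders records by comparing their raw bytes from first to last, which is my reading of the behaviour above and not something I checked in its source. The helper sort_by_raw_bytes is hypothetical and only imitates that comparison with Python's own bytes ordering.

```python
import struct

values = [9.94223926e-01, 7.55188959e-01, 2.87075284e-04,
          6.60265593e-01, 3.12176498e-02]


def sort_by_raw_bytes(xs, fmt):
    """Sort floats by comparing their packed 8-byte form byte by byte.

    Hypothetical stand-in for a byte-wise binary sort (assumption: this
    is how bsort compares keys).
    """
    packed = sorted(struct.pack(fmt, x) for x in xs)
    return [struct.unpack(fmt, b)[0] for b in packed]


# Big-endian: the most significant byte (sign and exponent) comes first
big = sort_by_raw_bytes(values, '>d')
# Little-endian: the least significant mantissa byte comes first
little = sort_by_raw_bytes(values, '<d')

print('big-endian matches numeric order   ?', big == sorted(values))
print('little-endian matches numeric order?', little == sorted(values))

# Caveat: the trick relies on all values being non-negative
mixed = sort_by_raw_bytes([-1.0, 2.0, -3.0], '>d')
print('with negative values:', mixed)
```

For non-negative floats, the big-endian byte string grows with the value, so a byte-wise sort gives the numeric order. With negative values the sign bit makes them compare after the positive ones, so the byteswap approach would need an extra transformation for arrays that mix signs.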