Python/Numpy have I already written the swiftest code for large array?

squenson · Dec-12-2017, 11:56 PM

The issue is that b has a length equal to np.amax(a) + 1, which is between 1 and 5 billion. Scanning it to find the 3s takes a lot of time. I suggest that you build a dictionary with the values of a as keys and their number of occurrences as values; then you can retrieve all the keys whose value is 3. Something like this (not tested!):

# Create a dictionary with elements of a
# and the number of times they appear
mydict = {}
for e in a:
    if e in mydict:
        mydict[e] += 1
    else:
        mydict[e] = 1

# Select keys which have the value 3
c = [k for k, v in mydict.items() if v == 3]

So instead of scanning 3 billion items (on average), you build a dictionary from the roughly 3 million elements of a. Even if you pessimistically count about 22 steps to access a single key (2**22 > 3,000,000; in practice a Python dict is a hash table, so a lookup takes constant time on average), plus another pass to produce the final output, that is fewer than 100 million operations. More importantly, the execution time now depends on the number of elements in a, not on the largest value in a.
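If you would rather stay inside NumPy, here is a minimal sketch of the same idea, assuming a is a one-dimensional integer array. The helper name values_appearing_n_times and the small sample array are made up for illustration. np.unique with return_counts=True sorts the array and counts each distinct value, so its cost also depends on the length of a rather than on its largest value. collections.Counter is shown next to it as a shorter way to write the dictionary loop.

```python
from collections import Counter

import numpy as np


def values_appearing_n_times(a, n=3):
    """Return the sorted distinct values that occur exactly n times in a."""
    # np.unique sorts a and returns each distinct value with its count
    values, counts = np.unique(a, return_counts=True)
    return values[counts == n]


# Hypothetical stand-in for the real data: few elements, but large values
a = np.array(
    [7, 4_000_000_000, 7, 12, 4_000_000_000, 7, 12, 4_000_000_000, 5],
    dtype=np.int64,
)

# NumPy version
c_numpy = values_appearing_n_times(a)

# Dictionary version, using Counter instead of the explicit loop
counts = Counter(a.tolist())
c_dict = sorted(k for k, v in counts.items() if v == 3)

print(c_numpy.tolist())
print(c_dict)
```

Both versions should give the same values; which one is faster on your data is something to measure rather than assume.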

Let us know your final code and the results of your tests!
