Apr-15-2019, 10:05 AM
Look at the following code:
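    # big_Erdos1 = list(".... long string....")
    import numpy as np

    def one():
        # Plain Python loop: flip each character one at a time.
        big_Erdos2 = ""
        for i in range(0, len(big_Erdos1)):
            if big_Erdos1[i] == "0":
                big_Erdos2 += "1"
            else:
                big_Erdos2 += "0"
        return big_Erdos2

    # Result array pre-filled with "0", and the input converted to an array.
    res = np.array(["0"] * len(big_Erdos1))
    data = np.array(big_Erdos1)

    def two():
        # Vectorized: set "1" wherever the input holds "0", with no Python loop.
        res[data == "0"] = "1"
        return res.tolist()

If you measure the execution time of one() and two(), you get results like this:

Output:
> %timeit -n 1000 one()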
258 µs ± 5.78 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
> %timeit -n 1000 two()
49.3 µs ± 2.93 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
The second solution is about 5 times faster than the first. This is because I used NumPy
and assigned values to the array with vectorized notation instead of a loop. NumPy is implemented
in C, so such operations execute much faster than they would if implemented as plain Python loops.
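The same idea can also be written with np.where, which builds the flipped array in one step instead of pre-filling a result array. This is only a sketch: the sample string and the names flip_loop and flip_vectorized are made up for illustration, since the real input is elided in the post.

```python
import numpy as np

# Hypothetical sample input; the original post elides the real string.
sample = list("0110100110010110")

def flip_loop(chars):
    """Pure-Python version: build the flipped string one character at a time."""
    out = ""
    for ch in chars:
        out += "1" if ch == "0" else "0"
    return out

def flip_vectorized(chars):
    """NumPy version: one vectorized comparison instead of a Python loop."""
    data = np.array(chars)
    # np.where picks "1" where the element equals "0", and "0" elsewhere.
    return np.where(data == "0", "1", "0").tolist()

# Both versions should agree on the same input.
assert "".join(flip_vectorized(sample)) == flip_loop(sample)
```

Note that the vectorized version returns a list of characters, like two() does, so it has to be joined before comparing it with the string from the loop version.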
So you need to rewrite your data processing algorithm using NumPy's vectorized notation (avoiding loops) or, if you need to find a substring in another string, use regular expressions (the re module).
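For the substring case, a minimal sketch with the re module might look like this. The text and the pattern are hypothetical, chosen only to show the call; they are not taken from the original data.

```python
import re

# Hypothetical text and pattern, used only for illustration.
text = "0110100110010110"
pattern = re.compile("1001")

# finditer scans the string inside the regex engine and yields one match
# object per non-overlapping occurrence; start() is the match's first index.
positions = [m.start() for m in pattern.finditer(text)]
print(positions)
```

Compiling the pattern once and reusing it avoids writing a character-by-character search loop in Python.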