How to use vectorization instead of for loop to improve efficiency in python?

How to use vectorization instead of for loop to improve efficiency in python? - PJLEMZ - Feb-05-2021

Hi all!

I've been having trouble with the run time of my code, which is taking too long. I suspect this is because I am using for loops, which is what I'm used to from MATLAB, where for loops seem to be quicker than in Python. I've been trying for a while to use NumPy vectorization instead, as I have read that it is more efficient, but I am having trouble implementing it. I would very much appreciate any guidance on how to do this. If there are other methods you would suggest to improve performance, do feel free to advise me.

The loop I am trying to do this on is:

for i in range(100000):
    theta = theta_rec[:][i]
    A = np.zeros((50,50))

    for k in range(50):
        for j in range(50):
            A[k,j] = math.sin(theta[j] - theta[k])

    B = -1 * np.array(sum(A))
    d_theta = 0.001*(5 + B)
    theta[(i+1),:] = theta_rec[i,:] + d_theta

    for m in range(50):
        D[(i+1),m] = theta_rec[(i+1),m]%(2.0*math.pi)

where theta_rec is a 100001x50 array. As you can see, this deeply nested for loop is bad for run time and is the bottleneck in my code.

In other words, I want to replace this for loop with some sort of NumPy vectorization function as this will improve run time performance. In particular, I think the inner 2 loops would benefit from this.

Any advice would be greatly appreciated!

Thank you!

RE: How to use vectorization instead of for loop to improve efficiency in python? - Tuxedo - Feb-05-2021

How are D and theta_rec initialized?

RE: How to use vectorization instead of for loop to improve efficiency in python? - PJLEMZ - Feb-05-2021

(Feb-05-2021, 03:17 AM)Tuxedo Wrote: How are D and theta_rec initialized?

Hi, thanks for getting back to me.

theta_rec is initialized like so:

theta_rec = np.zeros((50,100001))
for randtheta0 in range(50):
  r.seed(randtheta0)                     
  theta_rec[0,randtheta0] = 2.0*math.pi*r.random()

And D is initialized like so:

D = np.zeros((50, 100001))
for i_circavg in range(50):
  D[0,i_circavg] = theta_rec[0,i_circavg]%(2.0*math.pi)

RE: How to use vectorization instead of for loop to improve efficiency in python? - Tuxedo - Feb-05-2021

Can't seem to get your code to run:

theta[(i+1),:] = theta_rec[i,:] + d_theta

ValueError: operands could not be broadcast together with shapes (100001,) (50,)

import numpy as np
import math
import random as r

theta_rec = np.zeros((50,100001))
for randtheta0 in range(50):
    r.seed(randtheta0)                     
    theta_rec[0,randtheta0] = 2.0*math.pi*r.random()
    
D = np.zeros((50, 100001))
for i_circavg in range(50):
    D[0,i_circavg] = theta_rec[0,i_circavg]%(2.0*math.pi)    
    

for i in range(100000):
    theta = theta_rec[:][i]
    A = np.zeros((50,50))
 
    for k in range(50):
        for j in range(50):
            A[k,j] = math.sin(theta[j] - theta[k])
 
    B = -1 * np.array(sum(A))
    d_theta = 0.001*(5 + B)
    theta[(i+1),:] = theta_rec[i,:] + d_theta
 
    for m in range(50):
        D[(i+1),m] = theta_rec[(i+1),m]%(2.0*math.pi)
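
The broadcast error seems to come from the array shapes: with theta_rec allocated as (50, 100001), theta_rec[i,:] has 100001 elements while d_theta has 50. Below is a minimal sketch of the mismatch, assuming the intended layout is one row per time step (100001x50, as stated in the first post). The names n_osc, n_steps, wrong, failed and row_shape are hypothetical, and n_steps is reduced so the check runs quickly.

```python
import numpy as np

n_osc = 50       # number of oscillators (from the thread)
n_steps = 1000   # hypothetical: reduced from 100000 for a quick check

# Layout from the posted initialization: (50, n_steps + 1)
wrong = np.zeros((n_osc, n_steps + 1))
d_theta = np.zeros(n_osc)

try:
    # (n_steps + 1,) + (50,) cannot be broadcast together
    wrong[0, :] + d_theta
    failed = False
except ValueError:
    failed = True

# Assumed intended layout: one row per time step, one column per oscillator
theta_rec = np.zeros((n_steps + 1, n_osc))
row_shape = (theta_rec[0, :] + d_theta).shape

print(failed, row_shape)
```

If that assumption holds, allocating both theta_rec and D as (100001, 50) would make the row updates in the loop shape-consistent.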

RE: How to use vectorization instead of for loop to improve efficiency in python? - paul18fr - Feb-06-2021

Hi

Without going deeper into your code, here is some advice on speeding it up with vectorization:

1) Index for a basic loop (1 index)

Instead of looping over the index, you can create a vector of indices:

n = 100_000
i = np.arange(n)

See example here
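
As a rough illustration of this idea applied to the last loop of the original post (the one over m), the whole row can be processed at once. This is only a sketch: row is a hypothetical stand-in for theta_rec[i+1,:], filled with random values.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for theta_rec[i + 1, :]
row = rng.uniform(-10.0, 10.0, size=50)

# Loop version, element by element as in the original post
looped = np.zeros(50)
for m in range(50):
    looped[m] = row[m] % (2.0 * np.pi)

# Vectorized version: the modulo is applied to the whole row at once
vectorized = row % (2.0 * np.pi)

# Same result with an explicit index vector, as described in 1)
m_idx = np.arange(50)
indexed = row[m_idx] % (2.0 * np.pi)

print(np.allclose(looped, vectorized), np.array_equal(vectorized, indexed))
```

In practice the explicit index vector is often unnecessary, because operating on the slice directly already covers every element.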

2) Indexes for 2 nested loops

np.kron should do the job (example in this post); pay attention to the size of the vectors and the amount of memory needed (that's the main limitation you might run into).

In the following example, the indexes i, j are replaced by index1 and index2:

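import numpy as np

n1 = 3  # size of the outer loop
n2 = 5  # size of the inner loop
i1 = np.arange(0, n1)
i2 = np.arange(0, n2)
# Integer dtype so the results can be used directly as array indices
j1 = np.ones(n2, dtype=int)
j2 = np.ones(n1, dtype=int)
index1 = np.kron(i1, j1)  # outer index: each value repeated n2 times
index2 = np.kron(j2, i2)  # inner index: 0..n2-1 repeated n1 times

print("Index1: {}".format(index1))
print("Index2: {}".format(index2))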

It's the same strategy as in 1), but with 2 indexes.
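
To connect this to the two inner loops of the original post, here is a hedged sketch that builds the 50x50 matrix A without Python loops, once with broadcasting and once with np.kron index vectors. The phases in theta are random placeholders, and the names A_loop, A_vec, A_kron, k_idx and j_idx are hypothetical.

```python
import numpy as np

n = 50
rng = np.random.default_rng(0)
# Hypothetical phases standing in for one row of theta_rec
theta = rng.uniform(0.0, 2.0 * np.pi, size=n)

# Loop version from the original post
A_loop = np.zeros((n, n))
for k in range(n):
    for j in range(n):
        A_loop[k, j] = np.sin(theta[j] - theta[k])

# Broadcasting: theta[None, :] varies along columns (j),
# theta[:, None] varies along rows (k)
A_vec = np.sin(theta[None, :] - theta[:, None])

# np.kron index vectors, as in 2): k repeated n times, j tiled n times
k_idx = np.kron(np.arange(n), np.ones(n, dtype=int))
j_idx = np.kron(np.ones(n, dtype=int), np.arange(n))
A_kron = np.sin(theta[j_idx] - theta[k_idx]).reshape(n, n)

# The built-in sum(A) in the original adds the rows together,
# which corresponds to a sum over axis 0
B = -A_vec.sum(axis=0)
d_theta = 0.001 * (5 + B)

print(np.allclose(A_loop, A_vec), np.allclose(A_loop, A_kron))
```

The outer loop over time steps probably has to stay, since each step depends on the previous one, but the inner work then becomes a few array operations per step.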

Hope it helps.

Paul