Posts: 6
Threads: 1
Joined: Aug 2023
I have this quadruple for loop that I am looking to vectorize, and I have provided a toy example to highlight the problem. It takes a lot of computational time to run. Is there any way to reduce the computational time?
Any suggestions?
for i in range(0, 10):
    for j in range(0, 20):
        for k in range(0, 30):
            for l in range(0, 40):
                outer_point = [i, j]
                inner_point = [k, l]
                if inner_point != outer_point:
                    xdiff = i - k
                    ydiff = j - l
Posts: 6,818
Threads: 20
Joined: Feb 2020
What do you mean by "vectorize"? Are you using numpy or pandas?
Posts: 300
Threads: 72
Joined: Apr 2019
To replace the loops, I would have a look at np.kron to build index vectors (see the example below, with just 2 variables). For the other topic, np.where can give you the indexes where inner_point is different from (or identical to) outer_point.
Just hints
Paul
import numpy as np
n = 5
m = 8
i = np.ones(n)
j = np.arange(m)
k = np.kron(i, j)
print(f"k = {k}")
l = np.kron(j, i)
print(f"l = {l}")
Output:
k = [0. 1. 2. 3. 4. 5. 6. 7. 0. 1. 2. 3. 4. 5. 6. 7. 0. 1. 2. 3. 4. 5. 6. 7.
0. 1. 2. 3. 4. 5. 6. 7. 0. 1. 2. 3. 4. 5. 6. 7.]
l = [0. 0. 0. 0. 0. 1. 1. 1. 1. 1. 2. 2. 2. 2. 2. 3. 3. 3. 3. 3. 4. 4. 4. 4.
4. 5. 5. 5. 5. 5. 6. 6. 6. 6. 6. 7. 7. 7. 7. 7.]
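The same index-vector idea can be extended to all four loop variables of the toy example. What follows is only a minimal sketch, and it swaps np.kron for broadcasting with np.ogrid (a different technique from the example above). The names I, J, K, L and keep are made up for illustration.

```python
import numpy as np

# Open (broadcastable) index grids for the four loop variables.
# Shapes: I (10,1,1,1), J (1,20,1,1), K (1,1,30,1), L (1,1,1,40).
I, J, K, L = np.ogrid[0:10, 0:20, 0:30, 0:40]

# xdiff[i, 0, k, 0] == i - k and ydiff[0, j, 0, l] == j - l
xdiff = I - K
ydiff = J - L

# True wherever [k, l] != [i, j], i.e. at least one coordinate differs.
keep = (I != K) | (J != L)

print(xdiff.shape, ydiff.shape, keep.shape)
```

Because the grids are "open", xdiff and ydiff stay small; only keep is expanded to the full 4-D shape.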
Posts: 6
Threads: 1
Joined: Aug 2023
(Aug-25-2023, 02:18 PM)winash12 Wrote: I have this quadruple for loop that I am looking to vectorize and I have provided a toy example to highlight the problem. This is taking a lot of computational time to run. Is there any way to reduce the computational time ?
Yes, you are right, I should have written "unrolling the for loops" and not vectorization. In the real-world problem there is some numpy code along with this snippet. It is shown below:
for i in range(0, 10):
    for j in range(0, 20):
        for k in range(0, 30):
            for l in range(0, 50):
                outer_point = [i, j]
                inner_point = [k, l]
                if inner_point != outer_point:
                    xdiff = i - k * dx[k, l]
                    ydiff = j - l * dy[k, l]
                    someArray[i, j] += vv[k, l] * xdiff * dx[k, l]
So there is some vectorization here as well. As Paul has mentioned the ideal solution would be some usage of the numpy where clause and any hints (not looking for full blown solution) would be very helpful.
Posts: 300
Threads: 72
Joined: Apr 2019
I can't promise that a solution exists, but it would be fun to try. Additional data and information are needed, though: for example, the strategy will depend on the type of data, typically whether the values are integers or floats.
Could you provide some data samples?
Posts: 6
Threads: 1
Joined: Aug 2023
Aug-27-2023, 12:40 AM
(This post was last modified: Aug-27-2023, 12:40 AM by winash12.)
The data in the arrays are floats. We can use the numpy rand function to fill up the arrays with random data; it does not have to be specific at all. There is an external github link that I could share (offline perhaps) which contains a notebook with real data, but that takes a long time to run.
Posts: 300
Threads: 72
Joined: Apr 2019
Aug-27-2023, 05:49 PM
(This post was last modified: Aug-28-2023, 06:27 AM by paul18fr.)
Well, you're the right person to provide the input arrays:
- to figure out what you're working on
- because it's important to compare your result to a "vectorized" one
- because randomly generated arrays will obviously lead to more difficulties at the development stage
- because using floats is trickier, since the number of decimals matters; for instance, do you consider 28.5 equivalent to 28.50000001? Are you using float64?
In other words, provide the inputs, the results you have, and a brief "spec", and we'll have a look at your case.
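To make the float question concrete, here is a minimal sketch comparing the two values from the list above. It assumes np.isclose with its default tolerances is an acceptable notion of "equivalent"; the variable names are made up for illustration.

```python
import numpy as np

a = 28.5
b = 28.50000001

# Exact comparison on Python floats (float64): the values differ.
exact = (a == b)

# Tolerance-based comparison, default rtol=1e-05 and atol=1e-08.
close = np.isclose(a, b)

# float32 keeps roughly 7 significant decimal digits, so both values
# round to the same float32 number.
same_in_float32 = (np.float32(a) == np.float32(b))

print(exact, close, same_in_float32)
```

Note that in the loops posted so far the points are built from integer loop indexes, so an exact comparison is safe there; the tolerance only matters once the coordinates themselves are floats.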
Posts: 6
Threads: 1
Joined: Aug 2023
Aug-28-2023, 02:29 AM
(This post was last modified: Aug-28-2023, 04:17 AM by winash12.)
Paul - fair enough, that's completely acceptable. I have provided a complete github example, and the data is downloaded. One has to install a few libraries, and I usually do that using pip3.
https://github.com/winash12/atmospy_proj...er/mwe1.py
Any questions, please feel free to ask.
Posts: 300
Threads: 72
Joined: Apr 2019
Why not create a basic (and fast) example that mimics what you're studying:
- inputs
- a brief description of what you want (I won't spend time digging into your code)
- the expected output using your own developments
- such an approach should be used before performing complicated/heavy calculations
Whether you use a few dozen rows or a billion, it's just a question of duration and computing resources (mainly memory).
Some hints (first feeling on the strategy I might use):
- define a threshold, otherwise "if inner_point != outer_point" might be true all the time
- for instance, you can round values to the n decimals you want (it's a possibility)
- if you do the subtraction "inner_point - outer_point", you'll be able to search for values lower than or equal to the threshold using np.where => you'll get the indexes to deal with
- etc.
- even for the calculations (upsi and so on), Numpy has been optimized to work directly on arrays without loops
A lazy solution is to have a look at the Numba compiler...
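The threshold hint could be sketched as follows. This is only an illustration under assumptions: the coordinates and the threshold value are invented, and the names outer_points, inner_points, same and the index arrays are hypothetical.

```python
import numpy as np

# Hypothetical float coordinates, shape (n_points, 2).
outer_points = np.array([[0.0, 1.0], [2.5, 3.5]])
inner_points = np.array([[0.0, 1.0 + 1e-9], [2.5, 3.5], [4.0, 5.0]])

threshold = 1e-6  # assumed tolerance; pick one that suits the data

# diff[a, b, :] = inner_points[b] - outer_points[a], shape (2, 3, 2)
diff = inner_points[None, :, :] - outer_points[:, None, :]

# Two points count as "the same" when every coordinate is within the threshold.
same = (np.abs(diff) <= threshold).all(axis=-1)  # shape (2, 3)

# Index pairs (outer, inner) that coincide, and those that differ.
outer_same, inner_same = np.where(same)
outer_diff, inner_diff = np.where(~same)

print(outer_same, inner_same)
```

Rounding both arrays with np.round before an exact comparison is the other option mentioned in the list; it behaves slightly differently for values that sit close to a rounding boundary.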
Paul
Posts: 6
Threads: 1
Joined: Aug 2023
Paul, thanks. Yes, Numba I can rule out, because this is part of a product and I cannot insist that the client use a Numba compiler.
So the other hints are useful. I gave you the minimum code plus data that I thought was necessary to illustrate my problem. I apologize if it went beyond that.
The numpy where clause - that is how I vectorized similar situations in the past.
Create a boolean array based on a particular condition. Then use that boolean array in a numpy where clause and then create an assignment based on that condition.
Something like this -
has_right = s[:, 2:] > -999.99
dsdx[:, 1:-1] = np.where(has_right & has_value, (s[:, 2:] - s[:, 1:-1]) / di, dsdx[:, 1:-1])
Here has_right and has_value are boolean arrays.
In this current situation I was not able to visualize what would be my boolean condition since both inner_point and outer_point are tuples. Their inequality could be a boolean but not sure how to create an array of that.
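One way to picture the boolean condition: [k, l] != [i, j] is true whenever at least one coordinate differs, so it can be written element-wise as (i != k) | (j != l) on broadcast index arrays. The following is a minimal sketch under stated assumptions, not a drop-in solution. Random float arrays stand in for dx and vv, the accumulation is taken to sit inside the if (as in the loop posted earlier), ydiff is left out because the posted accumulation does not use it, and the names mask, terms, reference and vectorized are made up here.

```python
import numpy as np

rng = np.random.default_rng(0)
ni, nj, nk, nl = 10, 20, 30, 50

# Random stand-ins for the real data (assumption).
dx = rng.random((nk, nl))
vv = rng.random((nk, nl))

# Reference result: the quadruple loop from the earlier post.
reference = np.zeros((ni, nj))
for i in range(ni):
    for j in range(nj):
        for k in range(nk):
            for l in range(nl):
                if [k, l] != [i, j]:
                    xdiff = i - k * dx[k, l]
                    reference[i, j] += vv[k, l] * xdiff * dx[k, l]

# Broadcastable index grids:
# I (ni,1,1,1), J (1,nj,1,1), K (1,1,nk,1), L (1,1,1,nl).
I, J, K, L = np.ogrid[0:ni, 0:nj, 0:nk, 0:nl]

# Element-wise "inner_point != outer_point", shape (ni, nj, nk, nl).
mask = (I != K) | (J != L)

# dx and vv have shape (nk, nl) and broadcast against the last two axes.
xdiff_all = I - K * dx            # shape (ni, 1, nk, nl)
terms = vv * xdiff_all * dx       # shape (ni, 1, nk, nl)

# Zero out the excluded terms, then sum over the inner (k, l) axes.
vectorized = np.where(mask, terms, 0.0).sum(axis=(2, 3))

print(np.allclose(reference, vectorized))
```

The 4-D intermediate has ni*nj*nk*nl elements, so memory grows quickly with the grid sizes. For large grids it may be cheaper to sum over all (k, l) once and subtract the single excluded term for each (i, j).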