Bottom Page

• 0 Vote(s) - 0 Average
• 1
• 2
• 3
• 4
• 5
 Trying to replace for loops with numpy vspoloni Unladen Swallow Posts: 4 Threads: 1 Joined: Jul 2019 Reputation: 0 Likes received: 0 #1 Jul-30-2019, 08:00 AM Hello all, I just recently started working on python, and I am currently working on a Simulation that reads the distances between a ball and the Floor through 11 time steps and finds the minimum. The problem is that the ball and the Floor have different number of Elements, so I am using three for lops to run the program, which takes a long time. The Code is computing the data of a Matrix with 11 rows (for each time step) and 3 collumns (x,y,z axis). I am using the qd database to run this example. Also, the Code runs around 20 Million calculations. ```for t in range(tsteps): # Starting a loop for all time steps for i in range(lp1): # Starting the loop for all the elements in part 1 el1 = elpart1[i].get_coords() # This Returns the coordenates of the Elements as an np.ndarray (11 x 3) el1_id = elpart1[i].get_id() # This helps with identifying the Elements for j in range(lp2): # Starting the loop for all elements in part 2 el2 = elpart2[j].get_coords() el2_id = elpart2[j].get_id() # Using an equation to find the distances between coordinates of each part: dist = ((el2[t, 0] - el1[t, 0])**2 + (el2[t, 1] - el1[t, 1])**2 + (el2[t, 2] - el1[t, 2])**2)**(1/2) # Adding the elements to the empty lists that were created before the loops started dist_list.append(dist) element_pair_list.append([el1_id, el2_id])```How could I replace the for Loops to make the Code run more efficiently? Any help is much appreciated! paul18fr Wafer-Thin Wafer Posts: 86 Threads: 22 Joined: Apr 2019 Reputation: 3 Likes received: 2 #2 Jul-30-2019, 06:15 PM (This post was last modified: Jul-30-2019, 06:15 PM by paul18fr. Edited 5 times in total.) Maybe can you provide an example in order to give you some advice(s)? Maybe you can reduce the number of loop(s) (only onbe one per time step typically), and you 'll be able to speedup your calculation for sure. An example, using append is very costly (dynamic memory allocation = a new matrix is created for each append call, to add a new row) and you can easily avoid it either by creating a matrix, or by adding a column a matrix; numpy is implicitly vertorized and it's fast if it's used correctly. Share a bit more and the community will help you (I've demonstrated to an intern that it's possible to reduce the duration from more than an hour to 40 seconds - 3 nested loops were used against no one finally; it depends on the topic). here after a basic example from the numpy arrays/matrixes you get: ```#!/usr/bin/env python3 # -*- coding: utf-8 -*- import numpy as np import time n = 1_000_000; # elt1 and elt2 are supposed to be 2 matrixes; the dimensions are (n,4) # column 1 = number of element #column 2/3/4 = X/Y/Z coordinates ## Matrixes build-up; the first column are the elements number elt1 = np.random.random((n,4)); elt2 = np.random.random((n,4)); i = np.arange(1,n+1, dtype = np.float); elt1[:,0] = i; elt2[:,0] = i; # now the dimension is supposed to be unknown t0 = time.time(); nr = np.shape(elt1); # nr = number of rows dist = np.sqrt( (elt2[:, 1] - elt1[:, 1])**2 + (elt2[:, 2] - elt1[:, 2])**2 + (elt2[:, 3] - elt1[:, 3])**2 ) t1 = time.time(); loc_minimum = np.where(dist == min(dist)); minimum = float(dist[loc_minimum]); print("The minimum is {} and located at the row number {}".format(minimum,int(loc_minimum))); print("Duration : {}".format(t1-t0)); ```Paul vspoloni Unladen Swallow Posts: 4 Threads: 1 Joined: Jul 2019 Reputation: 0 Likes received: 0 #3 Jul-31-2019, 12:43 PM Thank you for your reply! I was trying to symplify everything while explaining the problem but it lacks a some important details. So as I mentioned in the beguining, the program computes the distances between a ball and the Floor through 11 time steps and finds the minimum distance. It does so by comparing the distances between all elements of each part (ball which is the first part or el1, and floor, the second part, el2), but the parts have a different number of elements each (2048 for the ball and 900 for the floor). These elements are basicaly the average of node coordenates in a certain area of the part. I'm using the loop "for i in range(lp1)" (part 1) so that it starts with the first element of that first part and then a second loop, that compares the first element of the first part to all elements of the second part, and them goes forward repeating the same steps for all elements of part 1. In that way, the code is comparing the distance between all the elements of each part and finding their minimum (which would be a case of "contact"). So, other than using an alternative to the append command, is there a way to replace the for loops that goes around each individual element at a time? Basicaly what im trying to do is to compare 2048 11x3 matrices to 900 also 11x3 matrices, without having to use the for loops to deal with the difference in the number of matrices. Hope this helps clarifying what I am trying to say! I'm sorry for my vocabulary regarding python terms, I have just started using the program so I'm still not familiar with them. paul18fr Wafer-Thin Wafer Posts: 86 Threads: 22 Joined: Apr 2019 Reputation: 3 Likes received: 2 #4 Jul-31-2019, 01:14 PM (This post was last modified: Jul-31-2019, 08:07 PM by paul18fr. Edited 1 time in total.) Are you speaking about the minimum distance between 2 meshes (nodes and elements)? (if so, you can probably reduce the number of calculations by looking if the element normal is well oriented for example; just an idea) I guess the floor is not flat and that's why you've to calculate the "mutual" distance between each element of each part (ball vs floor); I would say that we may certainly improve your code, but more information's are needed: can you provide some extract of the ball elements and the floor ones (few number of each)? Paul vspoloni Unladen Swallow Posts: 4 Threads: 1 Joined: Jul 2019 Reputation: 0 Likes received: 0 #5 Jul-31-2019, 03:38 PM Thanks again for your reply! You are correct, there are nodes and elements for each part. The floor is actually flat in the start, but it deflects. Also, the code is supposed to be flexible so it will work for different scenarios, thats why I'm using all elements even for the floor. I went back to the code one more time and I realized that the coordenates actualy come in a 1x3 format, bu are repeated 11 times for each different time step: (output from python) [[20.15 -7.75 -5.69]] [[-68.92 -73.01 -25.47]] This is an example of the first coordenate of the first and second part for the first time step. Then I use the equation: `dist = ((el2[t, 0] - el1[t, 0]) ** 2 + (el2[t, 1] - el1[t, 1]) ** 2 + (el2[t, 2] - el1[t, 2]) ** 2) ** (1 / 2)`Which will compute the distance between this two coordinates paul18fr Wafer-Thin Wafer Posts: 86 Threads: 22 Joined: Apr 2019 Reputation: 3 Likes received: 2 #6 Jul-31-2019, 09:15 PM (This post was last modified: Jul-31-2019, 09:15 PM by paul18fr. Edited 2 times in total.) well I've been thinking that with few thousand of elements, some loop may be avoided, using Kronecker product (np.kron); Just an example with 5 million of calculations (with no loop): ```import numpy as np import time n1 = 1_000; # number of nodes in the ball n2 = 5_000; # number of nodes in the floor ## Matrixes build-up; the first column are the elements number elt1 = np.random.random((n1,4)); # ball elt2 = np.random.random((n2,4)); # floor i1 = np.arange(0,n1, dtype = np.float); # float necessarily i2 = np.arange(n1,n1+n2, dtype = np.float); # just to have different numbers i2b = np.arange(0,n2, dtype = np.float); elt1[:,0] = i1; # the node number does not care elt2[:,0] = i2; # ball: the vect 1 is n1 blocs of n2 rows # equivalent to: vect 1 = [0 0 0 ... 0 0 0 1 1 1 1 ... 1 1 1 and son on], # floor; vect 2 is the blox of n2 nodes (floor) repeated n1 times # vect 1 and vect 2 are vectors of indexes of elt1 and elt2 respectively t0 = time.time(); j1 = np.ones(n2, dtype = np.int); # warning: indexes must be integer j2 = np.ones(n1, dtype = np.int); # warning: indexes must be integer i1 = i1.astype(int); # warning: indexes must be integer i2b = i2b.astype(int); # warning: indexes must be integer vect1 = np.kron(i1,j1); vect2 = np.kron(j2,i2b); NumberOfRows = np.shape(vect1); k = np.arange(0,NumberOfRows, dtype = np.int); distance_vector = np.sqrt( (elt2[vect2[k], 1] - elt1[vect1[k], 1])**2 + (elt2[vect2[k], 2] - elt1[vect1[k], 2])**2 + (elt2[vect2[k], 3] - elt1[vect1[k], 3])**2 ) minimum_distance = np.min(distance_vector); t1 = time.time(); ## check the distance between node1 of the ball to the node 1 of the floor # versus distance_vector check = np.sqrt( (elt2[0, 1] - elt1[0, 1])**2 + (elt2[0, 2] - elt1[0, 2])**2 + (elt2[0, 3] - elt1[0, 3])**2 ) diff = check - distance_vector; print("if diff equiv 0. then ok; here diff = {}".format(diff)); print("the {} calculations took {} second".format(NumberOfRows,t1-t0)); ```(seems to be good - it tooks 0.7 second on my old laptop with 6 Go of RAM) Nonetheless the amont of memory becomes too high with several million of elements i.e. for a more general modelling, and using loops cannot be avoided in my mind; I'm wondering if Numba can help (I've never used it so far, but it seems promising if the necessary numpy capabilities have been implemented). Your solution may be a mix between loops (1 per time step) and something close to the previous example Hope it helps you in a certain way Paul paul18fr Wafer-Thin Wafer Posts: 86 Threads: 22 Joined: Apr 2019 Reputation: 3 Likes received: 2 #7 Aug-01-2019, 08:53 AM Hi I took the opportunity to build a more realistic model, and to validate it using a professional tool; I got the same results (see pictures).   Unfortunately I don't know how to share the files on the forum, especially the original meshes (send me a PM); I just can share the python file hope it helps a bit Paul ```import numpy as np; import os, time; import h5py; PATH = str(os.getcwd()); FileNameHDF5 = 'Distance_min.h5'; t0 = time.time(); with h5py.File(PATH + '/' + FileNameHDF5, 'r') as h5: data = h5.get('Saddle/Saddle_nodes'); Floor = np.array(data); del data; data = h5.get('Sphere/Sphere_nodes'); Ball = np.array(data); del data; ## get the number of rows/nodes per part n1 = np.shape(Floor); # for the floor n2 = np.shape(Ball); # for the ball index1 = np.arange(0,n1, dtype = np.int); index2 = np.arange(0,n2, dtype = np.int); # ball: the vect 1 is n1 blocs of n2 rows # vect 1 = [0 0 0 ... 0 0 0 1 1 1 1 ... 1 1 1 and son on], # floor; vect 2 is the blox of n2 nodes (floor) repeated n1 times # vect 1 and vect 2 are vectors of indexes of floor and ball respectively j1 = np.ones(n2, dtype = np.int); # warning: indexes must be integer j2 = np.ones(n1, dtype = np.int); # warning: indexes must be integer vect1 = np.kron(index1,j1); # vect 1 returns the index of the Floor rows vect2 = np.kron(j2,index2); # vect 2 returns the same for the Ball rows NumberOfRows = np.shape(vect1); k = np.arange(0,NumberOfRows, dtype = np.int); # k allows to # distance_vector = np.sqrt( (Ball[vect2[k], 1] - Floor[vect1[k], 1])**2 + (Ball[vect2[k], 2] - Floor[vect1[k], 2])**2 + (Ball[vect2[k], 3] - Floor[vect1[k], 3])**2 ) minimum_distance = np.min(distance_vector); del n1; del n2; del j1; del j2; del vect1; del vect2; t1 = time.time(); print("The minimum distance has been found to {}".format(minimum_distance)) print("the {} calculations took {} second".format(NumberOfRows,t1-t0)); ``` vspoloni Unladen Swallow Posts: 4 Threads: 1 Joined: Jul 2019 Reputation: 0 Likes received: 0 #8 Aug-02-2019, 08:41 AM That is amazing, thank you so much! I think I'll have to take some time to study your method as this is pretty new to me, but it definetely helps! paul18fr Wafer-Thin Wafer Posts: 86 Threads: 22 Joined: Apr 2019 Reputation: 3 Likes received: 2 #9 Aug-04-2019, 07:09 AM (This post was last modified: Aug-04-2019, 07:09 AM by paul18fr. Edited 2 times in total.) Hi If you want to test the power of vectorization, try the following basic example: ```import numpy as np import time def square_function(x): return x**2 n = 10_000_000; # 10 million itérations # case 1: using loops y1 = np.zeros(n, dtype = np.float); # initialization t0 = time.time(); for i in range(n): y1[i] = 0.1*i**2; t1 = time.time(); duration1 = t1 - t0; print("Duration using loops = {} seconds".format(duration1)); #case 2: using vectorization y2 = np.zeros(n, dtype = np.float); # initialization t0 = time.time(); i = np.arange(n, dtype = np.float); # vector of indexes is created y2 = 0.1*i**2; t1 = time.time(); duration2 = t1 - t0; print("Duration using Vectorization = {} seconds".format(duration2)); ## ratio calculation print("Vectorization is {} times faster than loops use!!!".format(duration1 / duration2)); print("we check there's not difference: min(y2-y1) = {} and max(y2-y1) = {}".format(np.min(y2-y1),np.max(y2-y1))); del y2; del y1; # case 3: using a function works as well y3 = np.zeros(n, dtype = np.float); # initialization t0 = time.time(); i = np.arange(n, dtype = np.float); # vector of indexes is created y3 = square_function(i); t1 = time.time(); duration3 = t1 - t0; print("Duration using Vectorization + function = {} seconds".format(duration3)); del y3; ```Some basics on vectorization I'm currently writing down: Concerning the 2 vectors resulting from Kronecker product, a more "visual" approach: Paul « Next Oldest | Next Newest »

Top Page

 Possibly Related Threads... Thread Author Replies Views Last Post numpy.copy / numpy.delete paul18fr 3 502 Jul-26-2019, 01:51 PM Last Post: paul18fr "erlarge" a numpy-matrix to numpy-array PhysChem 2 585 Apr-09-2019, 04:54 PM Last Post: PhysChem

Forum Jump:

Users browsing this thread: 1 Guest(s)