Python Forum

Full Version: optimization problem for dataframe manipulation
Hi, I am fairly new to Python and still learning. I have an optimization problem and hope someone can help me with it:

Let inventories be a DataFrame. The values alpha, beta, zeta, sigma, gamma, omega, and mon are known coefficients, and v1 and w6 are data transformation functions.

What I want to do is the following (original code):
for item in inventories.columns:
        x = inventories[item]                                    # get a single column
        x = x.sort_values(ascending=True)                        # sort its values in ascending order
        x.index = range(len(x))                                  # reset the index to 0..len(x)-1; it is used in the next line
        ind = np.where(x<0, x.index.values, n - x.index.values)  # build an array that depends on whether x's value is negative or positive
        fac = np.where(x<0, -omega, zeta)                        # similar to the above
        val = np.where(x<0, -x.values, x.values)                 # similar to the above
        exponent = np.where(x<0, beta, alpha)                    # similar to the above
        # here is the data transformation that produces one value for each item
        v = [ fac * v1(val,exponent)*(w6((ind+1)/mon, sigma, gamma) - w6((ind)/mon, sigma, gamma))]
The above code runs, but it is extremely slow because it processes one column at a time. Is there any way to optimize the code so that it runs faster?

Here's what I would like to change it to (but I need help getting it to work):
	x = inventories
	x = pd.DataFrame(np.sort(x.values, axis=0), index=x.index, columns=x.columns) # this sorting works
	ind = np.where(x<0, x.index.values, n - x.index.values)                       # failed to broadcast due to dimensional difference
	fac = np.where(x<0, -omega, zeta)                                             # failed to broadcast due to dimensional difference
	val = np.where(x<0, -x.values, x.values)                                      # failed to broadcast due to dimensional difference
	exponent = np.where(x<0, beta, alpha)                                         # failed to broadcast due to dimensional difference
	v = fac * v1(val,exponent)*(w6((ind+1)/mon, sigma, gamma) - w6((ind)/mon, sigma, gamma)) # ???
Thanks to anyone who can help!
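
One possible way to vectorize this, offered as a sketch rather than a definitive answer. It assumes that n is the number of rows, that all coefficients are scalars, and that v1 and w6 operate element-wise on NumPy arrays. The post does not show v1 and w6, so the versions below are hypothetical placeholders, and transform_loop and transform_vectorized are names made up for this example. The loop version is included only so the two results can be compared.

```python
import numpy as np
import pandas as pd


# Hypothetical stand-ins: the real v1 and w6 are not shown in the post.
# Both are assumed to work element-wise on NumPy arrays.
def v1(val, exponent):
    return val ** exponent


def w6(p, sigma, gamma):
    return sigma * p ** gamma


def transform_loop(inventories, alpha, beta, zeta, sigma, gamma, omega, mon):
    """Column-by-column version, following the loop in the post."""
    n = len(inventories)  # assumption: n is the number of rows
    out = {}
    for item in inventories.columns:
        x = inventories[item].sort_values(ascending=True)
        x.index = range(len(x))
        ind = np.where(x < 0, x.index.values, n - x.index.values)
        fac = np.where(x < 0, -omega, zeta)
        val = np.where(x < 0, -x.values, x.values)
        exponent = np.where(x < 0, beta, alpha)
        out[item] = fac * v1(val, exponent) * (
            w6((ind + 1) / mon, sigma, gamma) - w6(ind / mon, sigma, gamma)
        )
    return pd.DataFrame(out)


def transform_vectorized(inventories, alpha, beta, zeta, sigma, gamma, omega, mon):
    """Whole-frame version: every column is sorted and transformed at once."""
    x = np.sort(inventories.to_numpy(dtype=float), axis=0)  # sort each column
    n = x.shape[0]
    # Positions 0..n-1 as a column vector of shape (n, 1), so that they
    # broadcast against the (n, m) array of sorted values.
    rank = np.arange(n)[:, None]
    neg = x < 0
    ind = np.where(neg, rank, n - rank)
    fac = np.where(neg, -omega, zeta)
    val = np.where(neg, -x, x)
    exponent = np.where(neg, beta, alpha)
    v = fac * v1(val, exponent) * (
        w6((ind + 1) / mon, sigma, gamma) - w6(ind / mon, sigma, gamma)
    )
    return pd.DataFrame(v, columns=inventories.columns)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo = pd.DataFrame(rng.normal(size=(1000, 5)), columns=list("abcde"))
    coeffs = dict(alpha=0.88, beta=0.88, zeta=1.0, sigma=1.0,
                  gamma=0.61, omega=2.25, mon=len(demo) + 1)
    slow = transform_loop(demo, **coeffs)
    fast = transform_vectorized(demo, **coeffs)
    print(np.allclose(slow.to_numpy(), fast.to_numpy()))
```

The main change is np.arange(n)[:, None]. In the attempt above, x.index.values has shape (n,), which NumPy tries to line up with the last axis (the columns) of the (n, m) comparison x<0, so the broadcast fails unless n happens to equal m. It also holds the original row labels rather than the positions after sorting. Reshaping the positions to (n, 1) lets them stretch across all columns. If v1 or w6 cannot accept 2-D arrays, they would need to be adapted first.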