Python Forum

Performance enhancement
Hi, I am new to Python programming and I need help speeding up a loop. Here is what I want to do:
Suppose I have two large DataFrames, df1 (prices) and df2 (volumes). I want to split each column into two groups (positive and non-positive) and then call a different function on each group. The direct way to do this is:

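Question 1:
(pseudo code):
result = {}
for c in df1.columns:
    col = df1[c].sort_values(ascending=True)
    pos = col[col > 0]
    neg = col[col <= 0]
    # store one value per column; a bare return would only work inside a function
    result[c] = func1(pos) + func2(neg)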
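
One possible way to write the Question 1 loop with apply, plus a fully masked variant, is sketched below. The bodies of func1 and func2 are not given in the post, so the sums used here are hypothetical stand-ins, and df1 is a tiny made-up frame. The masked variant only applies when both functions are simple reductions such as sum or mean.

```python
import pandas as pd

# Hypothetical stand-ins: the post does not say what func1/func2 compute.
def func1(pos: pd.Series) -> float:
    return float(pos.sum())

def func2(neg: pd.Series) -> float:
    return float(neg.sum())

def per_column(col: pd.Series) -> float:
    # Same steps as the loop body: sort, split by sign, combine.
    col = col.sort_values(ascending=True)
    return func1(col[col > 0]) + func2(col[col <= 0])

# Tiny made-up frame standing in for the large price DataFrame.
df1 = pd.DataFrame({"a": [1.0, -2.0, 3.0], "b": [-1.0, -1.0, 4.0]})

# apply() calls per_column once per column and returns a Series
# indexed by column name.
result_apply = df1.apply(per_column)

# If both functions are plain reductions, the whole frame can be masked
# at once: where() replaces non-matching cells with NaN and sum() skips NaN.
result_vec = df1.where(df1 > 0).sum() + df1.where(df1 <= 0).sum()
```

apply still runs a Python-level function per column, so any speedup over the explicit loop is likely to be small. The masked version avoids that, but it is worth timing both on the real data.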
Question 2:
Assume df1 and df2 have the same column names.
result = {}
for c in df1.columns:
    col1 = df1[c]
    col2 = df2[c]
    p1 = col1[col1 > 0]
    n1 = col1[col1 <= 0]
    p2 = col2[col2 > 0]
    n2 = col2[col2 <= 0]
    # a Series has corr(), not corrwith(); corr() aligns the two Series on their index
    result[c] = p1.corr(p2) + n1.corr(n2)
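
For Question 2, a loop-free sketch is to mask both frames by sign and let DataFrame.corrwith compute the column-wise correlations. This assumes the intent is a Pearson correlation over rows where both the price and the volume fall in the same sign group, which is what index alignment in the loop amounts to. The random frames below are made-up test data.

```python
import numpy as np
import pandas as pd

# Made-up data standing in for the price and volume frames.
rng = np.random.default_rng(0)
cols = ["a", "b", "c"]
df1 = pd.DataFrame(rng.normal(size=(200, 3)), columns=cols)
df2 = pd.DataFrame(rng.normal(size=(200, 3)), columns=cols)

# where() keeps matching cells and sets the rest to NaN;
# corrwith() then correlates same-named columns, ignoring rows
# where either side is NaN.
pos = df1.where(df1 > 0).corrwith(df2.where(df2 > 0))
neg = df1.where(df1 <= 0).corrwith(df2.where(df2 <= 0))
result_vec = pos + neg

# Reference: the explicit per-column loop, for comparison.
result_loop = pd.Series({
    c: df1[c][df1[c] > 0].corr(df2[c][df2[c] > 0])
       + df1[c][df1[c] <= 0].corr(df2[c][df2[c] <= 0])
    for c in cols
})
```

Both versions return NaN for a column when a sign group has fewer than two overlapping rows, so that case may need separate handling.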

This is time-consuming. Is there a smarter way to make it run faster (e.g. map, apply, vectorization)?