Python Forum

Performance enhancement - fimmu - Feb-12-2020

Hi, I am new to Python programming and I need help speeding up a nested loop. Here is what I want to do:
Suppose I have two large DataFrames, df1 (prices) and df2 (volumes). I want to split each column into two groups (positive and non-positive) and then call a different function on each group. The direct way to do this is:

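Question 1:
(pseudocode; func1 and func2 stand for the per-group functions)
results = {}
for c in df1.columns:
    col = df1[c].sort_values(ascending=True)
    pos = col[col > 0]    # positive values
    neg = col[col <= 0]   # non-positive values
    results[c] = func1(pos) + func2(neg)  # one result per column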
Question 2:
Assume df1 and df2 have the same column names.
results = {}
for c in df1.columns:
    col1 = df1[c]
    col2 = df2[c]
    p1 = col1[col1 > 0]
    n1 = col1[col1 <= 0]
    p2 = col2[col2 > 0]
    n2 = col2[col2 <= 0]
    # A Series has .corr (corrwith is a DataFrame method); it aligns on the index
    results[c] = p1.corr(p2) + n1.corr(n2)

This is time-consuming. Is there a smarter way to make it run faster (e.g. map, apply, vectorization)?
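
One possible direction, as a rough sketch rather than a definitive answer: mask the whole DataFrame once with DataFrame.where instead of slicing column by column, then use column-wise reductions. The names split_reduce and split_corr are made up for illustration. Since func1 and func2 are not given in the post, the sketch assumes both are a plain mean (so the sort is dropped, because a mean does not depend on order). For Question 2 it assumes the goal is the Pearson correlation over rows where both values fall in the same sign group.

```python
import numpy as np
import pandas as pd


def split_reduce(df: pd.DataFrame) -> pd.Series:
    """Question 1 sketch, ASSUMING func1 and func2 are both a plain mean.

    The real func1/func2 are not given in the post; any reduction that
    skips NaN (sum, std, ...) could be swapped in the same way.
    """
    pos = df.where(df > 0)    # keeps positive values, NaN elsewhere
    neg = df.where(df <= 0)   # keeps non-positive values, NaN elsewhere
    return pos.mean() + neg.mean()  # one value per column


def split_corr(df1: pd.DataFrame, df2: pd.DataFrame) -> pd.Series:
    """Question 2 sketch: per-column correlation within each sign group."""
    # corrwith pairs columns by name and ignores rows where either side is NaN
    pos = df1.where(df1 > 0).corrwith(df2.where(df2 > 0))
    neg = df1.where(df1 <= 0).corrwith(df2.where(df2 <= 0))
    return pos + neg


# Made-up data, only to show the call shape.
rng = np.random.default_rng(0)
df1 = pd.DataFrame(rng.normal(size=(200, 3)), columns=list("abc"))
df2 = pd.DataFrame(rng.normal(size=(200, 3)), columns=list("abc"))
q1 = split_reduce(df1)      # Series indexed by column name
q2 = split_corr(df1, df2)   # Series indexed by column name
```

If func1 or func2 cannot be written as a NaN-skipping reduction (for example, if they really need the sorted values), this masking trick does not apply directly and df1.apply with a per-column function may be the simpler fallback, though it is still a loop under the hood.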