Pandas - Dynamic column aggregation based on another column - theroadbacktonature - Apr-17-2020

An algorithm runs daily and generates a file. The set of columns in the file can change from run to run:

First run : country,date,exchange_rate,sale_amt,profit_amt,1st_purch,2nd_purch
Second run: country,date,exchange_rate,sale_amt,profit_amt,1st_purch,2nd_purch,3rd_purch
Third run : country,date,exchange_rate,sale_amt,profit_amt,1st_purch,2nd_purch,3rd_purch,margin_amt

Only the 'amt' columns should be divided by exchange_rate and then aggregated (sum). The remaining columns can be aggregated (sum) as they are.

Below is input (not output):

Desired Output:
Quote:
Logic: I wrote the code snippet below, but I am unable to build the correct expression to include the division by exchange_rate for the currency ('amt') columns.

columns = df.columns
groupbyColumns = ["country"]
columnNotRequiredAgg = ["date", "exchange_rate"]
# Exclude the grouping keys too, since they are not aggregated.
aggCols = list(set(columns) - set(columnNotRequiredAgg) - set(groupbyColumns))
expr = {x: 'sum' if 'amt' in x else 'sum' for x in aggCols}  # <<-- how to write correct logic
df.groupby(groupbyColumns).agg(expr)
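One possible way to get the division in, sketched under some assumptions: instead of trying to express the division inside the agg dictionary, divide the 'amt' columns by exchange_rate row by row first, then do a plain groupby sum. The sample DataFrame below is made up for illustration (the thread's actual input table is not shown), and the names group_cols, skip_cols, agg_cols, amt_cols, and converted are hypothetical. Because the column lists are derived from df.columns, a new column such as 3rd_purch or margin_amt in a later run should be picked up without changing the code.

```python
import pandas as pd

# Hypothetical sample data; the real input from the thread is not available.
df = pd.DataFrame({
    "country": ["US", "US", "IN", "IN"],
    "date": ["2020-04-01", "2020-04-02", "2020-04-01", "2020-04-02"],
    "exchange_rate": [1.0, 2.0, 4.0, 5.0],
    "sale_amt": [10.0, 20.0, 40.0, 50.0],
    "profit_amt": [1.0, 2.0, 4.0, 5.0],
    "1st_purch": [1, 2, 3, 4],
    "2nd_purch": [5, 6, 7, 8],
})

group_cols = ["country"]
skip_cols = ["date", "exchange_rate"]  # not aggregated

# Derive the columns dynamically so new columns in later runs are included.
agg_cols = [c for c in df.columns if c not in group_cols + skip_cols]
amt_cols = [c for c in agg_cols if "amt" in c]

# Divide each 'amt' column by exchange_rate, row by row, on a copy.
converted = df.copy()
converted[amt_cols] = converted[amt_cols].div(converted["exchange_rate"], axis=0)

# After the conversion, every remaining column is a plain sum.
result = converted.groupby(group_cols)[agg_cols].sum()
print(result)
```

The division has to happen before the sum because each row can have its own exchange_rate; dividing the summed amounts by a single rate afterwards would give a different result.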