Python Forum
Pandas - Dynamic column aggregation based on another column
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Pandas - Dynamic column aggregation based on another column
#1
An algorithm runs daily and generates a file. The file can have dynamic columns in each run.

First run : country,date,exchange_rate,sale_amt,profit_amt,1st_purch,2nd_purch
Second run: country,date,exchange_rate,sale_amt,profit_amt,1st_purch,2nd_purch,3rd_purch
Third run : country,date,exchange_rate,sale_amt,profit_amt,1st_purch,2nd_purch,3rd_purch,margin_amt

Only the 'amt' columns should be divided by exchange_rate and agregated (sum). Remaining columns can be aggregated (sum) as it is.

Below is input (not output):

Output:
country date sale_amt profit_amt 1st_purch 2nd_purch exchange_rate US 2019-12-01 100 10 11 21 2 US 2019-12-02 200 20 12 22 2 US 2019-12-03 300 30 13 23 2 US 2019-12-04 400 40 14 34 2 US 2019-12-05 500 50 15 25 2
Desired Output:

Output:
country sale_amt profit_amt 1st_purch 2nd_purch US 750 75 65 125
Quote:Logic:
sale_amt=sum(sale_amt/exchange_rate)
profit_amt=sum(profit_amt/exchange_rate)
1st_purch=sum(1st_purch)
2nd_purch=sum(2nd_purch)

I wrote below code snippet, but unable to generate the correct expression to include division by exchange_rate for currency columns.
columns = df.columns
groupbyColumns = ["country"]
columnNotRequiredAgg=["date","exchange_rate,"] #
aggCols = list(set(columns) - set(columnNotRequiredAgg))

expr = {x:'sum' if 'amt' in x else 'sum' for x in aggCols } <<-- how to write correct logic
df.groupby(groupbyColumns).agg(expr)
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Column Transformer with Mixed Types - sklearn aaldb 0 242 Feb-22-2024, 03:27 PM
Last Post: aaldb
  concat 3 columns of dataframe to one column flash77 2 776 Oct-03-2023, 09:29 PM
Last Post: flash77
  HTML Decoder pandas dataframe column mbrown009 3 961 Sep-29-2023, 05:56 PM
Last Post: deanhystad
  attempt to split values from within a dataframe column mbrown009 8 2,217 Apr-10-2023, 02:06 AM
Last Post: mbrown009
  Finding the median of a column in a huge CSV file markagregory 5 1,732 Jan-24-2023, 04:22 PM
Last Post: DeaD_EyE
  Make unique id in vectorized way based on text data column with similarity scoring ill8 0 861 Dec-12-2022, 03:22 AM
Last Post: ill8
  Impute 1 if previous row of 'days' column is between 0 & 7 JaneTan 2 1,050 Dec-08-2022, 07:42 PM
Last Post: deanhystad
  Increase df column values decimals SriRajesh 2 1,083 Nov-14-2022, 05:20 PM
Last Post: deanhystad
  pandas column percentile nuncio 7 2,383 Aug-10-2022, 04:41 AM
Last Post: nuncio
  how to expand each unique value in another column and fill zero if no match SriRajesh 0 814 Jul-10-2022, 09:21 AM
Last Post: SriRajesh

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020