Python Forum
Pandas - Dynamic column aggregation based on another column
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Pandas - Dynamic column aggregation based on another column
#1
An algorithm runs daily and generates a file. The file can have dynamic columns in each run.

First run : country,date,exchange_rate,sale_amt,profit_amt,1st_purch,2nd_purch
Second run: country,date,exchange_rate,sale_amt,profit_amt,1st_purch,2nd_purch,3rd_purch
Third run : country,date,exchange_rate,sale_amt,profit_amt,1st_purch,2nd_purch,3rd_purch,margin_amt

Only the 'amt' columns should be divided by exchange_rate and agregated (sum). Remaining columns can be aggregated (sum) as it is.

Below is input (not output):

Output:
country date sale_amt profit_amt 1st_purch 2nd_purch exchange_rate US 2019-12-01 100 10 11 21 2 US 2019-12-02 200 20 12 22 2 US 2019-12-03 300 30 13 23 2 US 2019-12-04 400 40 14 34 2 US 2019-12-05 500 50 15 25 2
Desired Output:

Output:
country sale_amt profit_amt 1st_purch 2nd_purch US 750 75 65 125
Quote:Logic:
sale_amt=sum(sale_amt/exchange_rate)
profit_amt=sum(profit_amt/exchange_rate)
1st_purch=sum(1st_purch)
2nd_purch=sum(2nd_purch)

I wrote below code snippet, but unable to generate the correct expression to include division by exchange_rate for currency columns.
columns = df.columns
groupbyColumns = ["country"]
columnNotRequiredAgg=["date","exchange_rate,"] #
aggCols = list(set(columns) - set(columnNotRequiredAgg))

expr = {x:'sum' if 'amt' in x else 'sum' for x in aggCols } <<-- how to write correct logic
df.groupby(groupbyColumns).agg(expr)
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Find duplicates in a pandas dataframe list column on other rows Calab 2 2,272 Sep-18-2024, 07:38 PM
Last Post: Calab
  Find strings by index from a list of indexes in a different Pandas dataframe column Calab 3 1,667 Aug-26-2024, 04:52 PM
Last Post: Calab
Question SOLVED: TTP match when final column may or may not be present Calab 1 1,052 Jul-03-2024, 02:45 PM
Last Post: Calab
  Create new column in dataframe Scott 10 3,648 Jun-30-2024, 10:18 PM
Last Post: Scott
  attempt to split values from within a dataframe column mbrown009 9 6,031 Jun-20-2024, 07:59 PM
Last Post: AdamHensley
  Putting column name to dataframe, can't work. jonah88888 2 3,302 Jun-18-2024, 09:19 PM
Last Post: AdamHensley
  Column Transformer with Mixed Types - sklearn aaldb 0 1,422 Feb-22-2024, 03:27 PM
Last Post: aaldb
  concat 3 columns of dataframe to one column flash77 2 2,169 Oct-03-2023, 09:29 PM
Last Post: flash77
  HTML Decoder pandas dataframe column mbrown009 3 2,732 Sep-29-2023, 05:56 PM
Last Post: deanhystad
  Finding the median of a column in a huge CSV file markagregory 5 3,337 Jan-24-2023, 04:22 PM
Last Post: DeaD_EyE

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020