Pandas: summing columns conditional on the column labels - Printable Version +- Python Forum (https://python-forum.io) +-- Forum: Python Coding (https://python-forum.io/forum-7.html) +--- Forum: Data Science (https://python-forum.io/forum-44.html) +--- Thread: Pandas: summing columns conditional on the column labels (/thread-29574.html) |
Pandas: summing columns conditional on the column labels - ddd2332 - Sep-10-2020 Hello everyone. I am using an input and output dataset to calculate the summation of columns conditional on the name of columns. For example, the data in the dataframe looks as below. Country | IndustryCode | US1 | US2 | US3 | Canada1 | Canada2 | Canada3 | China1 | China2 | China3 | ... US | 1 | 1 | 4 | 10 | 3 | 4 | 1 | 3 | 1 | 3 | ... US | 2 | 1 | 4 | 10 | 3 | 4 | 1 | 3 | 1 | 3 | ... US | 3 | 1 | 4 | 10 | 3 | 4 | 1 | 3 | 1 | 3 | ... Canada | 1 | 1 | 4 | 10 | 3 | 4 | 1 | 3 | 1 | 3 | ... Canada | 2 | 1 | 4 | 10 | 3 | 4 | 1 | 3 | 1 | 3 | ... Canada | 3 | 1 | 4 | 10 | 3 | 4 | 1 | 3 | 1 | 3 | ... China | 1 | 1 | 4 | 10 | 3 | 4 | 1 | 3 | 1 | 3 | ... China | 2 | 1 | 4 | 10 | 3 | 4 | 1 | 3 | 1 | 3 | ... China | 3 | 1 | 4 | 10 | 3 | 4 | 1 | 3 | 1 | 3 | ... ... | ... | .. | .. | .. | .. | .. | .. | .. | .. | .. | ... What I want to do is to add an additional column that will sum all columns that has China or Canada as a column label for each industry. In other words, if I add correctly, I should have the below dataframe: Country | IndustryCode | US1 | US2 | US3 | Canada1 | Canada2 | Canada3 | China1 | China2 | China3 | ... | Csum1 | Csum2 | Csum3 US | 1 | 1 | 4 | 10 | 3 | 4 | 1 | 3 | 1 | 3 | ... | 6 | 5 | 4 US | 2 | 1 | 4 | 10 | 3 | 4 | 1 | 3 | 1 | 3 | ... | 6 | 5 | 4 US | 3 | 1 | 4 | 10 | 3 | 4 | 1 | 3 | 1 | 3 | ... | 6 | 5 | 4 Canada | 1 | 1 | 4 | 10 | 3 | 4 | 1 | 3 | 1 | 3 | ... | 6 | 5 | 4 Canada | 2 | 1 | 4 | 10 | 3 | 4 | 1 | 3 | 1 | 3 | ... | 6 | 5 | 4 Canada | 3 | 1 | 4 | 10 | 3 | 4 | 1 | 3 | 1 | 3 | ... | 6 | 5 | 4 China | 1 | 1 | 4 | 10 | 3 | 4 | 1 | 3 | 1 | 3 | ... | 6 | 5 | 4 China | 2 | 1 | 4 | 10 | 3 | 4 | 1 | 3 | 1 | 3 | ... | 6 | 5 | 4 China | 3 | 1 | 4 | 10 | 3 | 4 | 1 | 3 | 1 | 3 | ... | 6 | 5 | 4 ... | ... | .. | .. | .. | .. | .. | .. | .. | .. | .. | ... | 6 | 5 | 4 To do this, I thought that I should first create the additional columns in the dataset for each industry using for loop. Then, I would add another loop in this look to check if each column's label contains the China or Canada. for x in rance(1,4): locals()[df["Csum"+str(x)]] = 0 for y in [US Canada China]: #I was not sure how I can approach this part...Could anyone give me an advice as to how I can approach this issue? Thank you in advance for your help! |