##### How to sum across variable columns in a dataframe
 How to sum across variable columns in a dataframe rennerom Unladen Swallow Posts: 3 Threads: 1 Joined: Jan 2021 Reputation: Jan-27-2021, 01:56 AM (This post was last modified: Jan-27-2021, 01:56 AM by rennerom.) Hi there, I end up doing lots of this type of data manipulation in excel as I haven't found a solid python solution yet. Lets say this is my pandas dataframe, and I need to sum columns P1 through P5 into a new column called "Total". That's a fairly easy task. data = {'ID': ['a','b','c','d','e'], 'Target': [2,4,1,2,5], 'P1': [30,50,22.2,1,24.4], 'P2':[40,66.34,33,5,8], 'P3':[59,100,41,6,30], 'P4':[35,130.22,12,0,21], 'P5':[22,90,10.5,1.1,19.9] } df = pd.DataFrame(data) df['Sum']=df.iloc[:,2:].sum(axis=1) print(df)Output: ID Target P1 P2 P3 P4 P5 Total 0 a 2 30.0 40.00 59 35.00 22.0 186.00 1 b 4 50.0 66.34 100 130.22 90.0 436.56 2 c 1 22.2 33.00 41 12.00 10.5 118.70 3 d 2 1.0 5.00 6 0.00 1.1 13.10 4 e 5 24.4 8.00 30 21.00 19.9 103.30But what if I don't always need to sum to P5, what if I want to stop at P4, or P3? What if where I stop depends on what some other variable? In this dataframe, I am using the Target column to dictate how "far out" I need to sum my P columns. So for example, if Target = 2, then "Total" would be P1 + P2. If Target = 3, then "Total" would be P1 + P2 + P3, and so on. I would expect my "Totals" column to now look like this... Output: ID Target P1 P2 P3 P4 P5 Total 0 a 2 30.0 40.00 59 35.00 22.0 70.00 1 b 4 50.0 66.34 100 130.22 90.0 346.56 2 c 1 22.2 33.00 41 12.00 10.5 22.20 3 d 2 1.0 5.00 6 0.00 1.1 6.00 4 e 5 24.4 8.00 30 21.00 19.9 103.30I just can't seem to get there. The last of several attempts of this dynamic sum led me to try different combinations of iloc and sum. Something like this. df['Total']=df.iloc[:,2:COLUMN_I_WANT_TO_SUM_TO].sum(axis=1)where the COLUMN_I_WANT_TO_SUM_TO would be df['Target'] But I keep running into the same issues when I run this (or anything similar) df['Total']=df.iloc[:,2:df['Target']].sum(axis=1)While trouble shooting this error it seems it's related to conflicting indices (like the index bases are shifted if that makes sense?), but I don't quite have the python chops to trace it to its source. Error:--------------------------------------------------------------------------- TypeError Traceback (most recent call last) in ----> 1 df['Sum']=df.iloc[:,2:df['Target']].sum(axis=1) 2 print(df) /Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/indexing.py in __getitem__(self, key) 887 # AttributeError for IntervalTree get_value 888 return self.obj._get_value(*key, takeable=self._takeable) --> 889 return self._getitem_tuple(key) 890 else: 891 # we by definition only have the 0th axis /Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/indexing.py in _getitem_tuple(self, tup) 1452 return self._getitem_lowerdim(tup) 1453 -> 1454 return self._getitem_tuple_same_dim(tup) 1455 1456 def _get_list_axis(self, key, axis: int): /Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/indexing.py in _getitem_tuple_same_dim(self, tup) 773 continue 774 --> 775 retval = getattr(retval, self.name)._getitem_axis(key, axis=i) 776 # We should never have retval.ndim < self.ndim, as that should 777 # be handled by the _getitem_lowerdim call above. /Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/indexing.py in _getitem_axis(self, key, axis) 1479 def _getitem_axis(self, key, axis: int): 1480 if isinstance(key, slice): -> 1481 return self._get_slice_axis(key, axis=axis) 1482 1483 if isinstance(key, list): /Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/indexing.py in _get_slice_axis(self, slice_obj, axis) 1511 1512 labels = obj._get_axis(axis) -> 1513 labels._validate_positional_slice(slice_obj) 1514 return self.obj._slice(slice_obj, axis=axis) 1515 /Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/indexes/base.py in _validate_positional_slice(self, key) 3319 """ 3320 self._validate_indexer("positional", key.start, "iloc") -> 3321 self._validate_indexer("positional", key.stop, "iloc") 3322 self._validate_indexer("positional", key.step, "iloc") 3323 /Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/indexes/base.py in _validate_indexer(self, form, key, kind) 5307 pass 5308 else: -> 5309 raise self._invalid_indexer(form, key) 5310 5311 def _maybe_cast_slice_bound(self, label, side: str_t, kind): TypeError: cannot do positional indexing on Index with these indexers [0 2 1 4 2 1 3 2 4 5 Name: Target, dtype: int64] of type Series Reply nealc Programmer named Tim Posts: 6 Threads: 0 Joined: Jan 2021 Reputation: Jan-28-2021, 07:08 AM Hi Rennerom That's a tricky one, maybe try apply: import pandas as pd from numpy import arange data = {'ID': ['a','b','c','d','e'], 'Target': [2,4,1,2,5], 'P1': [30,50,22.2,1,24.4], 'P2':[40,66.34,33,5,8], 'P3':[59,100,41,6,30], 'P4':[35,130.22,12,0,21], 'P5':[22,90,10.5,1.1,19.9] } df = pd.DataFrame(data) df['Sum']=df.iloc[:,2:].sum(axis=1) df["varsum"]=df.apply(lambda row: row[["P"+str(i) for i in arange(1,1+row["Target"])]].sum(), axis=1) print(df) ID Target P1 P2 P3 P4 P5 Sum varsum 0 a 2 30.0 40.00 59 35.00 22.0 186.00 70.00 1 b 4 50.0 66.34 100 130.22 90.0 436.56 346.56 2 c 1 22.2 33.00 41 12.00 10.5 118.70 22.20 3 d 2 1.0 5.00 6 0.00 1.1 13.10 6.00 4 e 5 24.4 8.00 30 21.00 19.9 103.30 103.30 Reply rennerom Unladen Swallow Posts: 3 Threads: 1 Joined: Jan 2021 Reputation: Jan-31-2021, 05:44 PM That was it! Thanks @nealc Reply

 Possibly Related Threads… Thread Author Replies Views Last Post Apply fillna to multiple columns in dataframe rraillon 2 645 Aug-05-2021, 01:11 PM Last Post: rraillon How to rename dataframe columns based on the content in an index? ar_mahdavi 2 1,031 Jun-07-2021, 06:09 AM Last Post: ricslato How to split dataframe object rows to columns Mekala 1 1,014 Nov-12-2020, 04:18 PM Last Post: michael1789 convert list to five columns dataframe in sequence tonycat 2 1,201 Sep-29-2020, 06:47 AM Last Post: tonycat How to melt dataframe multiple columns to one column Mekala 1 1,271 Sep-24-2020, 08:32 PM Last Post: scidam Concatenate two files with different columns into one dataframe moralear27 1 1,023 Sep-11-2020, 10:18 PM Last Post: moralear27 How to map dataframe based on two columns Mekala 0 3,131 Aug-29-2020, 07:36 AM Last Post: Mekala Cmparing columns in dataframe Rejoice 0 747 Aug-17-2020, 08:48 PM Last Post: Rejoice Fuzzy match on text columns within dataframe Nsaibot 0 3,222 Aug-27-2018, 10:52 PM Last Post: Nsaibot Newbie question to use lambda on multiple columns of a dataframe zydjohn 0 4,926 Jan-23-2018, 06:08 PM Last Post: zydjohn

Forum Jump:

### User Panel Messages

##### Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020