Python Forum

Full Version: Groupby in pandas with conditional - add and subtract
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
I have a dataframe with 4 columns "Identificação Única", "Nome", "Rubrica" and "Valor" and I would like to groupby the column "Identificação Única" e "Nome", and sum the column Valor, except when Rubrica is 240 or 245. Also, I want to minus the value in column Valor always Rubrica is 352 and add others(that is, different from "240","245" and "352")) and its result will be kept in the new column. And, I want to show the value in this column that is higher than 10000. The code is below: This way, the code add correctly all of rubricas except 240 and 245. But, I can not subtract the rubrica when it is 352. A picture of the dataframe is bellow:https://i.stack.imgur.com/WoWB9.jpg

The code that I have tried is this:

f0219_grouped = f0219[~f0219['Código Rubrica'].isin(['240','245'])].groupby(['Identificação Única', 'Nome'])['Valor'].sum()>10000 - f0219_grouped['Valor'] where f0219_grouped['Rubrica']==352`
And, when I run the code, I got the following error Message: `File "", line 1

f0219_grouped = f0219[~f0219['Código Rubrica'].isin(['240','245'])].groupby(['Identificação Única', 'Nome'])['Valor'].sum()>10000 - f0219_grouped['Valor'] where f0219_grouped['Rubrica']==352
                                                                                                                                                               ^
SyntaxError: invalid syntax`
python pandas
i guess you missed a column while doing groupby.. Apart from that you can make the below minor adjustments

f0219_grouped = f0219[~f0219['Rubrica'].isin(['240','245'])].groupby(['Identificação Única', 'Nome','Rubrica'])['Valor'].sum().reset_index()
f0219_grouped['Valor']=(f0219_grouped['Valor'] - (f0219_grouped['Valor'].where(f0219_grouped['Rubrica']==352)).fillna(0))
f0219_grouped[f0219_grouped["Valor"]>=10000]
Hi, thank you, but is not working. It is not adding all the values where Rubrica is not 240 or 245, and I need that all values belong to these two codes(245 and 240) is added.