Python Forum

Full Version: Groupby([]).sum() Miscalculation
Hi all, this is my first post so please take it easy!

The following code is dropping data. For example, the SEC figures are missing one "Insurance/Annuity Products" item (the total should be 2), and the FDIC figures are missing two "Residential Mortgage" line items. Any ideas what may be wrong? Thanks!

n = df1.groupby(['Year', 'State', 'Regulator', 'Industry','Product', 'Count']).sum()


Output using the code above:

Output:
Year  State    Regulator  Industry                Product                     Count
2012  Alabama  FDIC       Depository Institution  Debit Card                  1
                                                  Residential Mortgage        1
               OCC        Depository Institution  Bonds/Notes                 1
                                                  Commercial Mortgage         1
                                                  Credit Card                 1
                                                  Debit Card                  1
                                                                              3
                                                                              4
                                                  Residential Mortgage        2
                                                  Stocks                      1
               SEC        Securities/Futures      Insurance/Annuity Products  1
                                                  Stocks                      1
                                                                              3
Correct values:

Output:
Year  State    Industry                Regulator  Product                     Count
2012  Alabama  Depository Institution  FDIC       Residential Mortgage        1
2012  Alabama  Depository Institution  FDIC       Residential Mortgage        1
2012  Alabama  Depository Institution  FDIC       Residential Mortgage        1
2012  Alabama  Depository Institution  FDIC       Debit Card                  1

Year  State    Industry                Regulator  Product                     Count
2012  Alabama  Securities/Futures      SEC        Insurance/Annuity Products  1
2012  Alabama  Securities/Futures      SEC        Insurance/Annuity Products  1
2012  Alabama  Securities/Futures      SEC        Stocks                      3
2012  Alabama  Securities/Futures      SEC        Stocks                      1
It's just about impossible to say what is wrong without supporting code.
Please show enough code to support the analysis.
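
Without the real data this is only a guess, but one thing worth checking is that 'Count' appears in the list of grouping keys. When a column is a grouping key, rows that share the same value collapse into a single group and the column itself is never summed, which would look like dropped rows. The sketch below uses a made-up df1 modeled on the "Correct values" rows from the question (an assumption, since the original frame was not posted) and compares grouping with and without 'Count' as a key.

```python
import pandas as pd

# Hypothetical stand-in for df1, built from the "Correct values" rows in the
# question. The real frame was not posted, so this is an assumption.
df1 = pd.DataFrame({
    "Year": [2012] * 8,
    "State": ["Alabama"] * 8,
    "Industry": ["Depository Institution"] * 4 + ["Securities/Futures"] * 4,
    "Regulator": ["FDIC"] * 4 + ["SEC"] * 4,
    "Product": ["Residential Mortgage"] * 3 + ["Debit Card"]
               + ["Insurance/Annuity Products"] * 2 + ["Stocks"] * 2,
    "Count": [1, 1, 1, 1, 1, 1, 3, 1],
})

keys = ["Year", "State", "Regulator", "Industry", "Product"]

# As in the question: 'Count' is a grouping key, so identical rows collapse
# into one group and there is nothing left to sum.
collapsed = df1.groupby(keys + ["Count"]).sum()
print(len(collapsed), "groups when 'Count' is a key")

# Group by the descriptive columns only and sum 'Count' within each group.
totals = df1.groupby(keys)["Count"].sum()
print(totals)
```

If the real df1 has other numeric columns, they would still be summed in the first version, so the exact symptoms may differ from this sketch.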