Python Forum
Pandas confused - Printable Version

+- Python Forum (https://python-forum.io)
+-- Forum: Python Coding (https://python-forum.io/forum-7.html)
+--- Forum: General Coding Help (https://python-forum.io/forum-8.html)
+--- Thread: Pandas confused (/thread-34943.html)



Pandas confused - DPaul - Sep-18-2021

I use pandas occasionally to read excel files.
I found out that I can also use it to calculate subtotals
using group(...). Works fine. Except when i print the result
of the subtotal after grouping, it displays 0,1... on top,
are these indexes? Example sites don't seem comment on that,
so I must be missing something. What ?
thx,
Paul
import pandas as panda

Q = [['AF-10', 744.42],
['AF-10', 1243.68],
['AF-20', 90.0],
['BA-40', -1425.0],
['BA-40', 1425.0],
['BA-40', 1425.0],
['BA-40', 1425.0],
['BA-50', 150.0],
['BO-30', 16514.61],
['BR-10', 528.35]]  # etc....
Qdf = panda.DataFrame(Q)
print(Qdf.groupby([0]).sum())
Output:
1 0 AF-10 1988.10 AF-20 90.00 BA-40 2850.00 BA-50 150.00 BO-30 16514.61 BR-10 528.35



RE: Pandas confused - snippsat - Sep-18-2021

(Sep-18-2021, 07:26 AM)DPaul Wrote: Except when i print the result
of the subtotal after grouping, it displays 0,1... on top,
are these indexes?
No only 1 are index use df.columns to see,have to reset the index.
>>> df = Qdf.groupby([0]).sum()
>>> df
              1
0              
AF-10   1988.10
AF-20     90.00
BA-40   2850.00
BA-50    150.00
BO-30  16514.61
BR-10    528.35
>>> df.columns
Int64Index([1], dtype='int64')
>>> 
>>> df = df.reset_index()
>>> df.columns
Int64Index([0, 1], dtype='int64')
>>> df
       0         1
0  AF-10   1988.10
1  AF-20     90.00
2  BA-40   2850.00
3  BA-50    150.00
4  BO-30  16514.61
5  BR-10    528.35

>>> df[0]
0    AF-10
1    AF-20
2    BA-40
3    BA-50
4    BO-30
5    BR-10
Name: 0, dtype: object



RE: Pandas confused - DPaul - Sep-18-2021

Thanks for your speedy answer. Leaves me speechless.
I wonder if somebody thought of the KISS rule,
when he/she implemented this Smile
Paul


RE: Pandas confused - DPaul - Sep-18-2021

This also seems to do the trick:

Qlst = Qdf.values.tolist()
print(Qlst)

Paul


RE: Pandas confused - snippsat - Sep-18-2021

Yes if the goal to take data out Pandas to a list then it will work,as column index don't get used.
If take that list back into DataFrame then it will automatically give column index.
>>> data = Qdf.values.tolist()
>>> data
[['AF-10', 744.42],
 ['AF-10', 1243.68],
 ['AF-20', 90.0],
 ['BA-40', -1425.0],
 ['BA-40', 1425.0],
 ['BA-40', 1425.0],
 ['BA-40', 1425.0],
 ['BA-50', 150.0],
 ['BO-30', 16514.61],
 ['BR-10', 528.35]]
>>> type(data) # A normal Python list
<class 'list'>
>>> 
>>> import pandas as pd
>>> 
>>> df = panda.DataFrame(data)
>>> df
       0         1
0  AF-10    744.42
1  AF-10   1243.68
2  AF-20     90.00
3  BA-40  -1425.00
4  BA-40   1425.00
5  BA-40   1425.00
6  BA-40   1425.00
7  BA-50    150.00
8  BO-30  16514.61
9  BR-10    528.35
>>> df = df.rename(columns={0: 'Name', 1: 'Cost'})
>>> df
    Name      Cost
0  AF-10    744.42
1  AF-10   1243.68
2  AF-20     90.00
3  BA-40  -1425.00
4  BA-40   1425.00
5  BA-40   1425.00
6  BA-40   1425.00
7  BA-50    150.00
8  BO-30  16514.61
9  BR-10    528.35
>>> type(df) # A DataFrame
<class 'pandas.core.frame.DataFrame'>



RE: Pandas confused - jefsummers - Sep-18-2021

They could have made the default column names A, B, C like spreadsheets do, but once you get past 26 columns it works better to use numbers, IMHO.


RE: Pandas confused - DPaul - Sep-19-2021

Thing is, in this case I do not care about the headers,
but i want the row labels.
".reset_index" and "for column in df..." give me what i want.
Thanks for all the help.
Paul