Posts: 741
Threads: 122
Joined: Dec 2017
Sep-18-2021, 07:26 AM
(This post was last modified: Sep-18-2021, 07:26 AM by DPaul.)
I use pandas occasionally to read excel files.
I found out that I can also use it to calculate subtotals
using group(...). Works fine. Except when i print the result
of the subtotal after grouping, it displays 0,1... on top,
are these indexes? Example sites don't seem comment on that,
so I must be missing something. What ?
thx,
Paul
import pandas as panda
Q = [['AF-10', 744.42],
['AF-10', 1243.68],
['AF-20', 90.0],
['BA-40', -1425.0],
['BA-40', 1425.0],
['BA-40', 1425.0],
['BA-40', 1425.0],
['BA-50', 150.0],
['BO-30', 16514.61],
['BR-10', 528.35]] # etc....
Qdf = panda.DataFrame(Q)
print(Qdf.groupby([0]).sum()) Output: 1
0
AF-10 1988.10
AF-20 90.00
BA-40 2850.00
BA-50 150.00
BO-30 16514.61
BR-10 528.35
It is more important to do the right thing, than to do the thing right.(P.Drucker)
Better is the enemy of good. (Montesquieu) = French version for 'kiss'.
Posts: 7,312
Threads: 123
Joined: Sep 2016
(Sep-18-2021, 07:26 AM)DPaul Wrote: Except when i print the result
of the subtotal after grouping, it displays 0,1... on top,
are these indexes? No only 1 are index use df.columns to see,have to reset the index.
>>> df = Qdf.groupby([0]).sum()
>>> df
1
0
AF-10 1988.10
AF-20 90.00
BA-40 2850.00
BA-50 150.00
BO-30 16514.61
BR-10 528.35
>>> df.columns
Int64Index([1], dtype='int64')
>>>
>>> df = df.reset_index()
>>> df.columns
Int64Index([0, 1], dtype='int64')
>>> df
0 1
0 AF-10 1988.10
1 AF-20 90.00
2 BA-40 2850.00
3 BA-50 150.00
4 BO-30 16514.61
5 BR-10 528.35
>>> df[0]
0 AF-10
1 AF-20
2 BA-40
3 BA-50
4 BO-30
5 BR-10
Name: 0, dtype: object
Posts: 741
Threads: 122
Joined: Dec 2017
Thanks for your speedy answer. Leaves me speechless.
I wonder if somebody thought of the KISS rule,
when he/she implemented this
Paul
It is more important to do the right thing, than to do the thing right.(P.Drucker)
Better is the enemy of good. (Montesquieu) = French version for 'kiss'.
Posts: 741
Threads: 122
Joined: Dec 2017
This also seems to do the trick:
Qlst = Qdf.values.tolist()
print(Qlst)
Paul
It is more important to do the right thing, than to do the thing right.(P.Drucker)
Better is the enemy of good. (Montesquieu) = French version for 'kiss'.
Posts: 7,312
Threads: 123
Joined: Sep 2016
Sep-18-2021, 10:46 AM
(This post was last modified: Sep-18-2021, 10:46 AM by snippsat.)
Yes if the goal to take data out Pandas to a list then it will work,as column index don't get used.
If take that list back into DataFrame then it will automatically give column index.
>>> data = Qdf.values.tolist()
>>> data
[['AF-10', 744.42],
['AF-10', 1243.68],
['AF-20', 90.0],
['BA-40', -1425.0],
['BA-40', 1425.0],
['BA-40', 1425.0],
['BA-40', 1425.0],
['BA-50', 150.0],
['BO-30', 16514.61],
['BR-10', 528.35]]
>>> type(data) # A normal Python list
<class 'list'>
>>>
>>> import pandas as pd
>>>
>>> df = panda.DataFrame(data)
>>> df
0 1
0 AF-10 744.42
1 AF-10 1243.68
2 AF-20 90.00
3 BA-40 -1425.00
4 BA-40 1425.00
5 BA-40 1425.00
6 BA-40 1425.00
7 BA-50 150.00
8 BO-30 16514.61
9 BR-10 528.35
>>> df = df.rename(columns={0: 'Name', 1: 'Cost'})
>>> df
Name Cost
0 AF-10 744.42
1 AF-10 1243.68
2 AF-20 90.00
3 BA-40 -1425.00
4 BA-40 1425.00
5 BA-40 1425.00
6 BA-40 1425.00
7 BA-50 150.00
8 BO-30 16514.61
9 BR-10 528.35
>>> type(df) # A DataFrame
<class 'pandas.core.frame.DataFrame'>
Posts: 1,358
Threads: 2
Joined: May 2019
They could have made the default column names A, B, C like spreadsheets do, but once you get past 26 columns it works better to use numbers, IMHO.
Posts: 741
Threads: 122
Joined: Dec 2017
Thing is, in this case I do not care about the headers,
but i want the row labels.
".reset_index" and "for column in df..." give me what i want.
Thanks for all the help.
Paul
It is more important to do the right thing, than to do the thing right.(P.Drucker)
Better is the enemy of good. (Montesquieu) = French version for 'kiss'.
|