Pandas confused
Python Forum (https://python-forum.io), General Coding Help, thread /thread-34943.html
Pandas confused - DPaul - Sep-18-2021

I use pandas occasionally to read Excel files. I found out that I can also use it to calculate subtotals using groupby(...). Works fine. Except when I print the result of the subtotal after grouping, it displays 0, 1... on top. Are these indexes? Example sites don't seem to comment on that, so I must be missing something. What?
thx,
Paul

    import pandas as panda

    Q = [['AF-10', 744.42], ['AF-10', 1243.68], ['AF-20', 90.0],
         ['BA-40', -1425.0], ['BA-40', 1425.0], ['BA-40', 1425.0],
         ['BA-40', 1425.0], ['BA-50', 150.0], ['BO-30', 16514.61],
         ['BR-10', 528.35]]  # etc....
    Qdf = panda.DataFrame(Q)
    print(Qdf.groupby([0]).sum())

RE: Pandas confused - snippsat - Sep-18-2021

(Sep-18-2021, 07:26 AM)DPaul Wrote: Except when i print the result

No, only 1 is a column label; use df.columns to see. The 0 is the name of the index (the group keys), so you have to reset the index to get it back as a column.

    >>> df = Qdf.groupby([0]).sum()
    >>> df
                  1
    0
    AF-10   1988.10
    AF-20     90.00
    BA-40   2850.00
    BA-50    150.00
    BO-30  16514.61
    BR-10    528.35
    >>> df.columns
    Int64Index([1], dtype='int64')
    >>>
    >>> df = df.reset_index()
    >>> df.columns
    Int64Index([0, 1], dtype='int64')
    >>> df
           0         1
    0  AF-10   1988.10
    1  AF-20     90.00
    2  BA-40   2850.00
    3  BA-50    150.00
    4  BO-30  16514.61
    5  BR-10    528.35
    >>> df[0]
    0    AF-10
    1    AF-20
    2    BA-40
    3    BA-50
    4    BO-30
    5    BR-10
    Name: 0, dtype: object

RE: Pandas confused - DPaul - Sep-18-2021

Thanks for your speedy answer. Leaves me speechless. I wonder if somebody thought of the KISS rule when he/she implemented this.
Paul

RE: Pandas confused - DPaul - Sep-18-2021

This also seems to do the trick:

    Qlst = Qdf.values.tolist()
    print(Qlst)

Paul

RE: Pandas confused - snippsat - Sep-18-2021

Yes, if the goal is to take the data out of pandas into a list then it will work, as the column index doesn't get used. If you take that list back into a DataFrame, it will automatically get a column index again.

    >>> data = Qdf.values.tolist()
    >>> data
    [['AF-10', 744.42], ['AF-10', 1243.68], ['AF-20', 90.0], ['BA-40', -1425.0], ['BA-40', 1425.0], ['BA-40', 1425.0], ['BA-40', 1425.0], ['BA-50', 150.0], ['BO-30', 16514.61], ['BR-10', 528.35]]
    >>> type(data)  # A normal Python list
    <class 'list'>
    >>>
    >>> import pandas as pd
    >>>
    >>> df = pd.DataFrame(data)
    >>> df
           0         1
    0  AF-10    744.42
    1  AF-10   1243.68
    2  AF-20     90.00
    3  BA-40  -1425.00
    4  BA-40   1425.00
    5  BA-40   1425.00
    6  BA-40   1425.00
    7  BA-50    150.00
    8  BO-30  16514.61
    9  BR-10    528.35
    >>> df = df.rename(columns={0: 'Name', 1: 'Cost'})
    >>> df
        Name      Cost
    0  AF-10    744.42
    1  AF-10   1243.68
    2  AF-20     90.00
    3  BA-40  -1425.00
    4  BA-40   1425.00
    5  BA-40   1425.00
    6  BA-40   1425.00
    7  BA-50    150.00
    8  BO-30  16514.61
    9  BR-10    528.35
    >>> type(df)  # A DataFrame
    <class 'pandas.core.frame.DataFrame'>

RE: Pandas confused - jefsummers - Sep-18-2021

They could have made the default column names A, B, C like spreadsheets do, but once you get past 26 columns it works better to use numbers, IMHO.

RE: Pandas confused - DPaul - Sep-19-2021

Thing is, in this case I do not care about the headers, but I want the row labels. ".reset_index" and "for column in df..." give me what I want. Thanks for all the help.
Paul
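
For anyone landing here with the same goal (subtotals together with their row labels), here is a minimal sketch of one possible way to do it. It is not code from the thread: it assumes the columns are named up front, reusing the 'Name' and 'Cost' labels from the rename example above, and it passes as_index=False to groupby so the group key stays a regular column instead of becoming the index. The variable names rows and subtotals are only illustrative.

```python
import pandas as pd

# Sample rows from the first post: [label, amount]
rows = [['AF-10', 744.42], ['AF-10', 1243.68], ['AF-20', 90.0],
        ['BA-40', -1425.0], ['BA-40', 1425.0], ['BA-40', 1425.0],
        ['BA-40', 1425.0], ['BA-50', 150.0], ['BO-30', 16514.61],
        ['BR-10', 528.35]]

# Assumption: name the columns at creation time ('Name' and 'Cost' are
# borrowed from the rename example, not required by pandas)
df = pd.DataFrame(rows, columns=['Name', 'Cost'])

# as_index=False keeps the group key as an ordinary column,
# so no reset_index() is needed afterwards
subtotals = df.groupby('Name', as_index=False)['Cost'].sum()

# Walk the result row by row: each tuple is (label, subtotal)
for name, cost in subtotals.itertuples(index=False):
    print(f"{name}: {cost:.2f}")
```

Without as_index=False the labels end up in the index, which is the situation that reset_index() fixes in the replies above.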