Create new column in dataframe

**deanhystad** · Jun-21-2024, 10:36 PM

I cannot turn this:

Output:Date id class amount
1/12/2002 1 L 5
1/12/2002 1 H 6
1/12/2002 1 L 7
1/12/2002 2 L 3
1/01/2003 1 L 2
1/01/2003 1 H 8
1/01/2003 2 H 7
1/01/2003 2 L 9
1/01/2003 2 L 1

Into this:

Output:Date id L_class H_class
1/12/2002 1 12 6
1/12/2002 2 3
1/01/2003 1 2 8
1/01/2003 2 10 7

But I can get close:

import pandas as pd
from io import StringIO


data = """Date id class amount
1/12/2002 1 L 5
1/12/2002 1 H 6
1/12/2002 1 L 7
1/12/2002 2 L 3
1/01/2003 1 L 2
1/01/2003 1 H 8
1/01/2003 2 H 7
1/01/2003 2 L 9
1/01/2003 2 L 1
"""
df = pd.read_csv(StringIO(data), sep=" ")
print("It is easy to go from this", df, sep="\n")

df2 = df.groupby(["Date", "id", "class"]).agg({"amount": "sum"})
print("\nTo this", df2, sep="\n")

print("\nor this.", df2.reset_index(), sep="\n")

df3 = pd.pivot_table(df, index=["Date", "id"], columns=["class"], values=["amount"])
print("\nBut neither of those have H and L as columns.  Maybe a pivot table is a better idea.", df3, sep="\n")

df4 = df3.reset_index()
print(
    "\nNow we have H and L columns, but both the rows and columns are multi-index.  Flatten the row index.",
    df4,
    sep="\n",
)

df4.columns = ["Date", "id", "H", "L"]
print("\nAnd flatten the column index.", df4, sep="\n")

Output:It is easy to go from this
        Date  id class  amount
0  1/12/2002   1     L       5
1  1/12/2002   1     H       6
2  1/12/2002   1     L       7
3  1/12/2002   2     L       3
4  1/01/2003   1     L       2
5  1/01/2003   1     H       8
6  1/01/2003   2     H       7
7  1/01/2003   2     L       9
8  1/01/2003   2     L       1

To this
                    amount
Date      id class
1/01/2003 1  H           8
             L           2
          2  H           7
             L          10
1/12/2002 1  H           6
             L          12
          2  L           3

or this.
        Date  id class  amount
0  1/01/2003   1     H       8
1  1/01/2003   1     L       2
2  1/01/2003   2     H       7
3  1/01/2003   2     L      10
4  1/12/2002   1     H       6
5  1/12/2002   1     L      12
6  1/12/2002   2     L       3

But neither of those have H and L as columns.  Maybe a pivot table is a better idea.
             amount
class             H    L
Date      id
1/01/2003 1     8.0  2.0
          2     7.0  5.0
1/12/2002 1     6.0  6.0
          2     NaN  3.0

Now we have H and L columns, but both the rows and columns are multi-index.  Flatten the row index.
            Date id amount
class                    H    L
0      1/01/2003  1    8.0  2.0
1      1/01/2003  2    7.0  5.0
2      1/12/2002  1    6.0  6.0
3      1/12/2002  2    NaN  3.0

And flatten the column index.
        Date  id    H    L
0  1/01/2003   1  8.0  2.0
1  1/01/2003   2  7.0  5.0
2  1/12/2002   1  6.0  6.0
3  1/12/2002   2  NaN  3.0

Possibly Related Threads…
Thread		Author	Replies	Views	Last Post
	attempt to split values from within a dataframe column	mbrown009	9	2,782	Jun-20-2024, 07:59 PM Last Post: AdamHensley
	Putting column name to dataframe, can't work.	jonah88888	2	2,052	Jun-18-2024, 09:19 PM Last Post: AdamHensley
	concat 3 columns of dataframe to one column	flash77	2	1,064	Oct-03-2023, 09:29 PM Last Post: flash77
	HTML Decoder pandas dataframe column	mbrown009	3	1,283	Sep-29-2023, 05:56 PM Last Post: deanhystad
	New Dataframe Column Based on Several Conditions	nb1214	1	1,933	Nov-16-2021, 10:52 PM Last Post: jefsummers
	Setting the x-axis to a specific column in a dataframe	devansing	0	2,152	May-23-2021, 12:11 AM Last Post: devansing
	[Solved] How to refer to dataframe column name based on a list	lorensa74	1	2,436	May-17-2021, 07:02 AM Last Post: lorensa74
	Pandas - Creating additional column in dataframe from another column	Azureaus	2	3,132	Jan-11-2021, 09:53 PM Last Post: Azureaus
	Filter data based on a value from another dataframe column and create a file using lo	pawanmtm	1	4,422	Jul-15-2020, 06:20 PM Last Post: pawanmtm
	Pandas DataFrame and unmatched column	sritsv19	0	3,155	Jul-07-2020, 12:52 PM Last Post: sritsv19

Create new column in dataframe

User Panel Messages

Announcements