attempt to split values from within a dataframe column

**deanhystad** · Apr-07-2023, 03:34 AM

Like this?

import pandas as pd


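def myfunc(row):
    # Use .iloc for positional access; plain row[0] on a label-indexed
    # Series is deprecated in recent pandas versions.
    return row.iloc[0] + row.iloc[1], row.iloc[0] * row.iloc[1]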


df = pd.DataFrame({"A": range(5), "B": range(10, 15), "C": range(20, 25)})
df[["D", "E"]] = df[["A", "B"]].apply(myfunc, axis=1, result_type='expand')
print(df)

Output:
   A   B   C   D   E
0  0  10  20  10   0
1  1  11  21  12  11
2  2  12  22  14  24
3  3  13  23  16  39
4  4  14  24  18  56

This part creates a dataframe that has columns "A" and "B".

df[["A", "B"]]

This says we are going to pass the rows of that dataframe (axis=1) to myfunc().

df[["A", "B"]].apply(myfunc, axis=1)

myfunc() returns a tuple with two values. I want to expand this to two columns.

df[["A", "B"]].apply(myfunc, axis=1, result_type='expand')
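To see what result_type='expand' actually changes, here is a small sketch comparing the two calls. It assumes the same df and myfunc as in this post (written with .iloc for positional access); the names packed and expanded are just mine for illustration.

```python
import pandas as pd


def myfunc(row):
    # Positional access, so the function works for any two columns.
    return row.iloc[0] + row.iloc[1], row.iloc[0] * row.iloc[1]


df = pd.DataFrame({"A": range(5), "B": range(10, 15), "C": range(20, 25)})

# Without result_type, each row's tuple stays packed in one Series element.
packed = df[["A", "B"]].apply(myfunc, axis=1)

# With result_type='expand', each tuple element becomes its own column.
# The columns are labeled 0 and 1 until they are assigned to named columns.
expanded = df[["A", "B"]].apply(myfunc, axis=1, result_type="expand")

print(type(packed).__name__)
print(type(expanded).__name__, list(expanded.columns))
```

The first call gives a Series of tuples, the second a DataFrame with two columns, which is why only the second can be assigned to df[["D", "E"]] column by column.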

Finally, I want to add these two new columns to df and call them "D" and "E".

df[["D", "E"]] = df[["A", "B"]].apply(myfunc, axis=1, result_type='expand')

Notice that the argument to myfunc() is a Series (row), and I get the values of the Series by integer position instead of by index name ("A", "B").

def myfunc(row):
    return row.iloc[0] + row.iloc[1], row.iloc[0] * row.iloc[1]

This lets me apply the same function to any two columns. All I have to do is select a different pair of columns (df[["A", "C"]], for example) to supply the rows.
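As a rough sketch of that reuse, the same positional function can be pointed at columns "A" and "C" instead. The output column names "F" and "G" are made up for this example.

```python
import pandas as pd


def myfunc(row):
    # Works on whichever two columns were selected, in the order selected.
    return row.iloc[0] + row.iloc[1], row.iloc[0] * row.iloc[1]


df = pd.DataFrame({"A": range(5), "B": range(10, 15), "C": range(20, 25)})

# Same function, different inputs: only the column selection changes.
# "F" and "G" are hypothetical names for the sum and product of A and C.
df[["F", "G"]] = df[["A", "C"]].apply(myfunc, axis=1, result_type="expand")
print(df)
```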

Alternatively, I could pass the entire row and use column names inside the function to access the values.

import pandas as pd


def function(row):
    return row["A"] + row["B"], row["A"] * row["B"]


df = pd.DataFrame({"A": range(5), "B": range(10, 15), "C": range(20, 25)})
df[["D", "E"]] = df.apply(function, axis=1, result_type='expand')
print(df)

The result is the same. It is also the same here, where I pass the column names as additional arguments.

import pandas as pd


def function(row, a, b):
    return row[a] + row[b], row[a] * row[b]


df = pd.DataFrame({"A": range(5), "B": range(10, 15), "C": range(20, 25)})
df[["D", "E"]] = df.apply(function, args=("A", "B"), axis=1, result_type='expand')
print(df)
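If you want to convince yourself that the three variants agree, a quick check along these lines should do it. The function names by_position, by_name and by_args are only labels I picked for this sketch, and by_position uses .iloc for the positional lookup.

```python
import pandas as pd


def by_position(row):
    return row.iloc[0] + row.iloc[1], row.iloc[0] * row.iloc[1]


def by_name(row):
    return row["A"] + row["B"], row["A"] * row["B"]


def by_args(row, a, b):
    return row[a] + row[b], row[a] * row[b]


df = pd.DataFrame({"A": range(5), "B": range(10, 15), "C": range(20, 25)})

r1 = df[["A", "B"]].apply(by_position, axis=1, result_type="expand")
r2 = df.apply(by_name, axis=1, result_type="expand")
r3 = df.apply(by_args, args=("A", "B"), axis=1, result_type="expand")

# DataFrame.equals compares shape, values and dtypes.
print(r1.equals(r2) and r2.equals(r3))
```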
