Apr-07-2023, 03:34 AM
Like this?
import pandas as pd

def myfunc(row):
    # .iloc selects by position; plain row[0] on a label-indexed Series is deprecated in newer pandas
    return row.iloc[0] + row.iloc[1], row.iloc[0] * row.iloc[1]

df = pd.DataFrame({"A": range(5), "B": range(10, 15), "C": range(20, 25)})
df[["D", "E"]] = df[["A", "B"]].apply(myfunc, axis=1, result_type='expand')
print(df)
Output:
   A   B   C   D   E
0  0  10  20  10   0
1  1  11  21  12  11
2  2  12  22  14  24
3  3  13  23  16  39
4  4  14  24  18  56
This part creates a dataframe that has columns "A" and "B".
df[["A", "B"]]
This says we are going to pass the rows of that dataframe (axis=1) to myfunc().
df[["A", "B"]].apply(myfunc, axis=1)
myfunc() returns a tuple with two values. I want to expand this to two columns.
df[["A", "B"]].apply(myfunc, axis=1, result_type='expand')
And I want to add these two new columns to df and call them "D" and "E".
df[["D", "E"]] = df[["A", "B"]].apply(myfunc, axis=1, result_type='expand')
Notice that the argument to myfunc() is a series (the row), and I get its values by integer position instead of by index name ("A", "B").
def myfunc(row):
    return row.iloc[0] + row.iloc[1], row.iloc[0] * row.iloc[1]
This lets me apply the same function to any two columns. All I have to do is create a different dataframe (df[["A", "C"]], for example) to supply the rows.
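As a rough sketch of that reuse, this applies the same myfunc() to columns "A" and "C" instead. The new column names "F" and "G" are made up for illustration, and I am assuming positional access through row.iloc so the function does not depend on the column names.

```python
import pandas as pd

def myfunc(row):
    # Positional access, so the function does not care what the columns are called.
    return row.iloc[0] + row.iloc[1], row.iloc[0] * row.iloc[1]

df = pd.DataFrame({"A": range(5), "B": range(10, 15), "C": range(20, 25)})

# Same function, but a different pair of columns supplies the rows.
# "F" and "G" are hypothetical names for the new columns.
df[["F", "G"]] = df[["A", "C"]].apply(myfunc, axis=1, result_type="expand")
print(df)
```

The order of the columns in the selection decides which value is first and which is second inside myfunc().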
Alternatively, I could pass the entire row and use column index names inside myfunc() to access the values.
import pandas as pd

def myfunc(row):
    # Access the values by column name instead of by position.
    return row["A"] + row["B"], row["A"] * row["B"]

df = pd.DataFrame({"A": range(5), "B": range(10, 15), "C": range(20, 25)})
df[["D", "E"]] = df.apply(myfunc, axis=1, result_type='expand')
print(df)
The result is the same. It is also the same here, where I pass the column names as additional arguments.
import pandas as pd

def myfunc(row, a, b):
    # a and b are the names of the columns to combine.
    return row[a] + row[b], row[a] * row[b]

df = pd.DataFrame({"A": range(5), "B": range(10, 15), "C": range(20, 25)})
df[["D", "E"]] = df.apply(myfunc, args=("A", "B"), axis=1, result_type='expand')
print(df)
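If you want to convince yourself that the three variants really agree, something like the following should do it. This is only a sketch: the function names by_position, by_name and by_args are my own labels for the three versions above, and the comparison uses DataFrame.equals().

```python
import pandas as pd

def by_position(row):
    # Variant 1: values taken by position from a two-column selection.
    return row.iloc[0] + row.iloc[1], row.iloc[0] * row.iloc[1]

def by_name(row):
    # Variant 2: values taken by hard-coded column name from the full row.
    return row["A"] + row["B"], row["A"] * row["B"]

def by_args(row, a, b):
    # Variant 3: column names supplied through apply(..., args=...).
    return row[a] + row[b], row[a] * row[b]

df = pd.DataFrame({"A": range(5), "B": range(10, 15), "C": range(20, 25)})

r1 = df[["A", "B"]].apply(by_position, axis=1, result_type="expand")
r2 = df.apply(by_name, axis=1, result_type="expand")
r3 = df.apply(by_args, args=("A", "B"), axis=1, result_type="expand")

# Compare the expanded results before assigning them to any columns.
print(r1.equals(r2) and r2.equals(r3))
```

The args version is the most flexible of the three, since the same function works for any pair of columns without building a separate dataframe first.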