Python Forum
Add two resultant fields from python lambda function - Printable Version



Add two resultant fields from python lambda function - klllmmm - Jun-03-2023

I have a dataframe to which I would like to add two fields at once, based on a condition. I can achieve this by defining a function.

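import pandas as pd
import numpy as np  # needed for np.nan in the code below

# Note: "np.nan" in store_label is a literal string here, not a missing value
df1 = pd.DataFrame(data = {'name':["Total",'Tozi Ford','Susan Mock','Donale Fucci'],
                           'store_label':["np.nan",'Merchant_A','Merchant_B','Merchant_C'],
                           "earned_amount":[4300,1000,300,3000],
                           "earned_amount1":[5300,2000,600,6000],
                           })
By defining a function: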
def my_function(x):
    amount1 = x["earned_amount"] if ((x['store_label'] == 'Merchant_A') |(x['store_label'] == 'Merchant_B')) else np.nan
    amount2 = x["earned_amount1"] if ((x['store_label'] == 'Merchant_A') |(x['store_label'] == 'Merchant_B')) else np.nan
    return amount1, amount2


df1[['amount1', 'amount2']] = df1.apply(my_function, axis=1, result_type="expand")
Output:
df1
Out[232]:
           name  store_label  earned_amount  earned_amount1  amount1  amount2
0         Total       np.nan           4300            5300      NaN      NaN
1     Tozi Ford   Merchant_A           1000            2000   1000.0   2000.0
2    Susan Mock   Merchant_B            300             600    300.0    600.0
3  Donale Fucci   Merchant_C           3000            6000      NaN      NaN
Is it possible to do the same without separately defining a function, by putting the conditions inside a lambda expression?

df1[['amount1', 'amount2']] = zip(df1.apply(lambda x : ( x["earned_amount"],x["earned_amount1"]) if ((x['store_label'] == 'Merchant_A') |(x['store_label'] == 'Merchant_B'))
                                                                       else np.nan, axis = 1 ))
Error:
Traceback (most recent call last):
  File "C:\Users\KP\AppData\Local\Temp\ipykernel_25588\41625259.py", line 1, in <module>
    df1[['amount1', 'amount2']] = zip(df1.apply(lambda x : ( x["earned_amount"],x["earned_amount1"]) if ((x['store_label'] == 'Merchant_A') |(x['store_label'] == 'Merchant_B'))
  File "C:\Users\KP\Anaconda3\lib\site-packages\pandas\core\frame.py", line 3966, in __setitem__
    self._setitem_array(key, value)
  File "C:\Users\KP\Anaconda3\lib\site-packages\pandas\core\frame.py", line 4025, in _setitem_array
    self._iset_not_inplace(key, value)
  File "C:\Users\KP\Anaconda3\lib\site-packages\pandas\core\frame.py", line 4043, in _iset_not_inplace
    if np.shape(value)[-1] != len(key):
IndexError: tuple index out of range
Any solution to achieve this would be much appreciated.


RE: Add two resultant fields from python lambda function - Gribouillis - Jun-03-2023

There is no difference between using a function and using a lambda expression. In the above lambda expression, I think the else part should be else (np.nan, np.nan). Also what does the zip() do and why did you remove the result_type="expand" ?
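
To illustrate the zip() question, here is a small sketch of why the traceback ends in IndexError. The per_row list is a hypothetical stand-in for what the apply() call returns (a tuple for matching rows, a bare NaN otherwise); it is not taken from the thread. The np.shape(value)[-1] check is the one shown in the traceback.

```python
import numpy as np

# Hypothetical stand-in for the per-row results of the apply() call
per_row = [np.nan, (1000, 2000), (300, 600), np.nan]

# zip() over a single iterable does not transpose anything:
# it only wraps each element in a 1-tuple.
wrapped = list(zip(per_row))
print(wrapped[1])

# NumPy treats a zip object as one opaque object, so its shape is ().
zip_shape = np.shape(zip(per_row))
print(zip_shape)

# Indexing the empty shape tuple with [-1], as the pandas check in the
# traceback does, raises the IndexError.
try:
    np.shape(zip(per_row))[-1]
except IndexError as exc:
    message = str(exc)
    print(message)
```

So the zip() never gets as far as producing two columns; the assignment fails while pandas is still checking the shape of the right-hand side.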


RE: Add two resultant fields from python lambda function - klllmmm - Jun-03-2023

(Jun-03-2023, 11:52 AM)Gribouillis Wrote: There is no difference between using a function and using a lambda expression. In the above lambda expression, I think the else part should be else (np.nan, np.nan). Also what does the zip() do and why did you remove the result_type="expand" ?

Thanks @Gribouillis
df1[['amount1', 'amount2']] = df1.apply(lambda x : ( x["earned_amount"],x["earned_amount1"]) if ((x['store_label'] == 'Merchant_A') |(x['store_label'] == 'Merchant_B'))
                                                                       else (np.nan, np.nan), axis = 1 , result_type="expand")
Output:
df1
Out[235]:
           name  store_label  earned_amount  earned_amount1  amount1  amount2
0         Total       np.nan           4300            5300      NaN      NaN
1     Tozi Ford   Merchant_A           1000            2000   1000.0   2000.0
2    Susan Mock   Merchant_B            300             600    300.0    600.0
3  Donale Fucci   Merchant_C           3000            6000      NaN      NaN
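
For anyone wondering what result_type="expand" contributes here, the following is a minimal sketch on a cut-down copy of the data (the names df, pick, plain and expanded are made up for this illustration). Without the argument, apply() returns a single Series whose values are tuples; with it, each tuple is spread across columns, which is what a two-column assignment needs.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "store_label": ["np.nan", "Merchant_A", "Merchant_B", "Merchant_C"],
    "earned_amount": [4300, 1000, 300, 3000],
    "earned_amount1": [5300, 2000, 600, 6000],
})

def pick(x):
    # Both branches return a 2-tuple so every row has the same length
    if x["store_label"] in ("Merchant_A", "Merchant_B"):
        return (x["earned_amount"], x["earned_amount1"])
    return (np.nan, np.nan)

# Default: one Series holding a tuple per row
plain = df.apply(pick, axis=1)

# expand: one DataFrame with a column per tuple position (labelled 0 and 1)
expanded = df.apply(pick, axis=1, result_type="expand")

print(type(plain).__name__, type(expanded).__name__)
print(expanded)
```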



RE: Add two resultant fields from python lambda function - deanhystad - Jun-03-2023

I would do it like this:
import pandas as pd
import numpy as np


df1 = pd.DataFrame(
    data = {
        'name':["Total",'Tozi Ford','Susan Mock','Donale Fucci'],
        'store_label':[np.nan,'Merchant_A','Merchant_B','Merchant_C'],
        "earned_amount":[4300,1000,300,3000],
        "earned_amount1":[5300,2000,600,6000],
    }
)

# Select the rows of approved merchants
merchants = df1["store_label"].isin(["Merchant_A", "Merchant_B"])
print(merchants)

# Create New columns
df1["amount"] = df1.loc[merchants, "earned_amount"]
df1["amount1"] = df1.loc[merchants, "earned_amount1"]
print(df1)
Output:
0    False
1     True
2     True
3    False
Name: store_label, dtype: bool
           name  store_label  earned_amount  earned_amount1  amount  amount1
0         Total          NaN           4300            5300     NaN      NaN
1     Tozi Ford   Merchant_A           1000            2000  1000.0   2000.0
2    Susan Mock   Merchant_B            300             600   300.0    600.0
3  Donale Fucci   Merchant_C           3000            6000     NaN      NaN
Or, if you really need to create two new columns with one command:
# Create New columns
df1[["amount", "amount1"]] = df1.loc[merchants, ["earned_amount", "earned_amount1"]]
Or, if you prefer code that is compact but hard to read:
df1 = pd.DataFrame(
    data = {
        'name':["Total",'Tozi Ford','Susan Mock','Donale Fucci'],
        'store_label':[np.nan,'Merchant_A','Merchant_B','Merchant_C'],
        "earned_amount":[4300,1000,300,3000],
        "earned_amount1":[5300,2000,600,6000],
    }
)

df1[["amount", "amount1"]] = df1.loc[
    df1["store_label"].isin(["Merchant_A", "Merchant_B"]),
    ["earned_amount", "earned_amount1"]]
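
The boolean-mask approach above works because pandas aligns on the index when a Series is assigned to a column: rows missing from the selection end up as NaN. Below is a small sketch of that behaviour on a cut-down copy of the data. The amount_where column and the use of Series.where are additions for comparison only, not something proposed in the thread.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "store_label": [np.nan, "Merchant_A", "Merchant_B", "Merchant_C"],
    "earned_amount": [4300, 1000, 300, 3000],
    "earned_amount1": [5300, 2000, 600, 6000],
})

mask = df["store_label"].isin(["Merchant_A", "Merchant_B"])

# The selection keeps only index labels 1 and 2
selected = df.loc[mask, "earned_amount"]
print(list(selected.index))

# Assignment aligns on the index, so labels 0 and 3 become NaN
df["amount"] = selected

# Series.where keeps values where the mask is True and puts NaN elsewhere
df["amount_where"] = df["earned_amount"].where(mask)

print(df[["store_label", "amount", "amount_where"]])
```

Both columns should come out the same; which form reads better is mostly a matter of taste.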



RE: Add two resultant fields from python lambda function - rajeshgk - Jun-06-2023

import pandas as pd

df1 = pd.DataFrame(data={
    'name': ["Total", 'Tozi Ford', 'Susan Mock', 'Donale Fucci'],
    'store_label': ["np.nan", 'Merchant_A', 'Merchant_B', 'Merchant_C'],
    'earned_amount': [4300, 1000, 300, 3000],
    'earned_amount1': [5300, 2000, 600, 6000]
})

# Use lambda function with conditions to create a new column
df1['earned_amount2'] = df1.apply(
    lambda row: row['earned_amount'] + row['earned_amount1']
    if row['store_label'] != "np.nan" else None,
    axis=1)

print(df1)