Python Forum
Match two data sets based on item values - Printable Version

+- Python Forum (https://python-forum.io)
+-- Forum: Python Coding (https://python-forum.io/forum-7.html)
+--- Forum: Data Science (https://python-forum.io/forum-44.html)
+--- Thread: Match two data sets based on item values (/thread-2613.html)



Match two data sets based on item values - klllmmm - Mar-28-2017

I have two data frames, which i want to combine usinga left join or similar.
PV number wise subtotal of the amount column of the dataframe (Pmt) has to be match with the second table (BankWithdrawal).
If such sub total matches then join that row of second table with the relevant first table rows. Resultant table with the other two tables attached.

Get the sub totals in PVNo wise from Pmt dataframe
subtotal=pmt[["PVNo.","Amount"]].groupby('PVNo.').sum()
Great if someone can help me to achive the resultant table.


RE: Match two data sets based on item values - Larz60+ - Mar-28-2017

what are you summing? there are no numbers here
show a run-able snippet of code


RE: Match two data sets based on item values - klllmmm - Mar-28-2017

(Mar-28-2017, 07:35 PM)Larz60+ Wrote: what are you summing? there are no numbers here
show a run-able snippet of code

Thanks for your interest.

I'm not sure how to add data frames to this post. So i have attached 'pmt' dataframe & 'withdrawal' dataframe.

I'm taking the subtotal of Amount column for each value in 'PVno' column.

Now i have to match the amounts resulted in above with the 'B.amount' column in 'withdrawal' dataframe & make a combined dataframe.

Expected dataframe (Resultcombined) is also attached.


RE: Match two data sets based on item values - zivoni - Mar-28-2017

You can do SQL-like joins on pandas dataframes with pandas.merge()
merged_dataframe = pd.merge(subtotal_dataframe, bankwithdrawal_dataframe)
You can specify what type of join with parameter how=, default is 'inner'; on default it uses common columns as keys for join, but you can specify it with on= parameter (or with left_on and right_on= if your keys are different).


RE: Match two data sets based on item values - klllmmm - Mar-28-2017

(Mar-28-2017, 07:55 PM)zivoni Wrote: You can do SQL-like joins on pandas dataframes with pandas.merge()
merged_dataframe = pd.merge(subtotal_dataframe, bankwithdrawal_dataframe)
You can specify what type of join with parameter how=, default is 'inner'; on default it uses common columns as keys for join, but you can specify it with on= parameter (or with left_on and right_on= if your keys are different).

Thanks for the reply. It works.
It seems common column('Amount') has to have same column name in both tables.


RE: Match two data sets based on item values - zivoni - Mar-28-2017

You can use left_on='column_in_left_frame', right_on='column_in_right_frame' as a parameters for pd.merge.
EDIT: i guess that i misread your reply and it worked (same column)


RE: Match two data sets based on item values - klllmmm - Mar-28-2017

(Mar-28-2017, 08:19 PM)zivoni Wrote: You can use left_on='column_in_left_frame', right_on='column_in_right_frame' as a parameters for pd.merge.

Thanks, That's great.

Is there a way to add multiple common colums to match.
Let say i have another common column like 'Date' in both tables, where i have to match only if the 'date' in second tables is on or after the 'date' in table one.


RE: Match two data sets based on item values - zivoni - Mar-29-2017

Perhaps just join it (without using Date as a key) and after that you can subset resulting dataframe (common columns not used as a key should have _x and _y appened).
df[df.Date_x <= df.Date_y]