Python Forum
Match two data sets based on item values
Thread Rating:
  • 1 Vote(s) - 4 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Match two data sets based on item values
#1
I have two data frames, which i want to combine usinga left join or similar.
PV number wise subtotal of the amount column of the dataframe (Pmt) has to be match with the second table (BankWithdrawal).
If such sub total matches then join that row of second table with the relevant first table rows. Resultant table with the other two tables attached.

Get the sub totals in PVNo wise from Pmt dataframe
subtotal=pmt[["PVNo.","Amount"]].groupby('PVNo.').sum()
Great if someone can help me to achive the resultant table.

Attached Files

.xlsx   Pmt.xlsx (Size: 8.5 KB / Downloads: 84)
.xlsx   BankWithdrawal.xlsx (Size: 8.19 KB / Downloads: 102)
.xlsx   ResultCombined.xlsx (Size: 8.66 KB / Downloads: 78)
Reply
#2
what are you summing? there are no numbers here
show a run-able snippet of code
Reply
#3
(Mar-28-2017, 07:35 PM)Larz60+ Wrote: what are you summing? there are no numbers here
show a run-able snippet of code

Thanks for your interest.

I'm not sure how to add data frames to this post. So i have attached 'pmt' dataframe & 'withdrawal' dataframe.

I'm taking the subtotal of Amount column for each value in 'PVno' column.

Now i have to match the amounts resulted in above with the 'B.amount' column in 'withdrawal' dataframe & make a combined dataframe.

Expected dataframe (Resultcombined) is also attached.
Reply
#4
You can do SQL-like joins on pandas dataframes with pandas.merge()
merged_dataframe = pd.merge(subtotal_dataframe, bankwithdrawal_dataframe)
You can specify what type of join with parameter how=, default is 'inner'; on default it uses common columns as keys for join, but you can specify it with on= parameter (or with left_on and right_on= if your keys are different).
Reply
#5
(Mar-28-2017, 07:55 PM)zivoni Wrote: You can do SQL-like joins on pandas dataframes with pandas.merge()
merged_dataframe = pd.merge(subtotal_dataframe, bankwithdrawal_dataframe)
You can specify what type of join with parameter how=, default is 'inner'; on default it uses common columns as keys for join, but you can specify it with on= parameter (or with left_on and right_on= if your keys are different).

Thanks for the reply. It works.
It seems common column('Amount') has to have same column name in both tables.
Reply
#6
You can use left_on='column_in_left_frame', right_on='column_in_right_frame' as a parameters for pd.merge.
EDIT: i guess that i misread your reply and it worked (same column)
Reply
#7
(Mar-28-2017, 08:19 PM)zivoni Wrote: You can use left_on='column_in_left_frame', right_on='column_in_right_frame' as a parameters for pd.merge.

Thanks, That's great.

Is there a way to add multiple common colums to match.
Let say i have another common column like 'Date' in both tables, where i have to match only if the 'date' in second tables is on or after the 'date' in table one.
Reply
#8
Perhaps just join it (without using Date as a key) and after that you can subset resulting dataframe (common columns not used as a key should have _x and _y appened).
df[df.Date_x <= df.Date_y]
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  replace sets of values in an array without using loops paul18fr 7 669 Jun-20-2022, 08:15 PM
Last Post: paul18fr
  Data sets comparison Fraetos 0 876 Sep-14-2021, 06:45 AM
Last Post: Fraetos
  update values in one dataframe based on another dataframe - Pandas iliasb 2 5,928 Aug-14-2021, 12:38 PM
Last Post: jefsummers
  Pandas Data frame column condition check based on length of the value aditi06 1 1,744 Jul-28-2021, 11:08 AM
Last Post: jefsummers
  Mann Whitney U-test on several data sets rybina 2 1,412 Jan-05-2021, 03:08 PM
Last Post: rybina
  Dropping Rows From A Data Frame Based On A Variable JoeDainton123 1 1,581 Aug-03-2020, 02:05 AM
Last Post: scidam
  Filter data based on a value from another dataframe column and create a file using lo pawanmtm 1 3,482 Jul-15-2020, 06:20 PM
Last Post: pawanmtm
  Least-squares fit multiple data sets multiverse22 1 1,782 Jun-06-2020, 01:38 AM
Last Post: Larz60+
  Identifying consecutive masked values in a 3D data array chai0404 12 4,153 Feb-01-2020, 12:59 PM
Last Post: perfringo
  Help with Data Match Theory randor 2 1,466 Dec-25-2019, 05:57 PM
Last Post: randor

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020