Python Forum
Match two data sets based on item values
Thread Rating:
  • 1 Vote(s) - 4 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Match two data sets based on item values
#1
I have two data frames, which i want to combine usinga left join or similar.
PV number wise subtotal of the amount column of the dataframe (Pmt) has to be match with the second table (BankWithdrawal).
If such sub total matches then join that row of second table with the relevant first table rows. Resultant table with the other two tables attached.

Get the sub totals in PVNo wise from Pmt dataframe
subtotal=pmt[["PVNo.","Amount"]].groupby('PVNo.').sum()
Great if someone can help me to achive the resultant table.

Attached Files

.xlsx   Pmt.xlsx (Size: 8.5 KB / Downloads: 84)
.xlsx   BankWithdrawal.xlsx (Size: 8.19 KB / Downloads: 102)
.xlsx   ResultCombined.xlsx (Size: 8.66 KB / Downloads: 78)
Reply
#2
what are you summing? there are no numbers here
show a run-able snippet of code
Reply
#3
(Mar-28-2017, 07:35 PM)Larz60+ Wrote: what are you summing? there are no numbers here
show a run-able snippet of code

Thanks for your interest.

I'm not sure how to add data frames to this post. So i have attached 'pmt' dataframe & 'withdrawal' dataframe.

I'm taking the subtotal of Amount column for each value in 'PVno' column.

Now i have to match the amounts resulted in above with the 'B.amount' column in 'withdrawal' dataframe & make a combined dataframe.

Expected dataframe (Resultcombined) is also attached.
Reply
#4
You can do SQL-like joins on pandas dataframes with pandas.merge()
merged_dataframe = pd.merge(subtotal_dataframe, bankwithdrawal_dataframe)
You can specify what type of join with parameter how=, default is 'inner'; on default it uses common columns as keys for join, but you can specify it with on= parameter (or with left_on and right_on= if your keys are different).
Reply
#5
(Mar-28-2017, 07:55 PM)zivoni Wrote: You can do SQL-like joins on pandas dataframes with pandas.merge()
merged_dataframe = pd.merge(subtotal_dataframe, bankwithdrawal_dataframe)
You can specify what type of join with parameter how=, default is 'inner'; on default it uses common columns as keys for join, but you can specify it with on= parameter (or with left_on and right_on= if your keys are different).

Thanks for the reply. It works.
It seems common column('Amount') has to have same column name in both tables.
Reply
#6
You can use left_on='column_in_left_frame', right_on='column_in_right_frame' as a parameters for pd.merge.
EDIT: i guess that i misread your reply and it worked (same column)
Reply
#7
(Mar-28-2017, 08:19 PM)zivoni Wrote: You can use left_on='column_in_left_frame', right_on='column_in_right_frame' as a parameters for pd.merge.

Thanks, That's great.

Is there a way to add multiple common colums to match.
Let say i have another common column like 'Date' in both tables, where i have to match only if the 'date' in second tables is on or after the 'date' in table one.
Reply
#8
Perhaps just join it (without using Date as a key) and after that you can subset resulting dataframe (common columns not used as a key should have _x and _y appened).
df[df.Date_x <= df.Date_y]
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Supervised learning, tree based model - problems splitting data Pixel 0 644 May-16-2023, 05:25 PM
Last Post: Pixel
  Grouping Data based on 30% bracket purnima1 0 940 Feb-16-2023, 07:14 PM
Last Post: purnima1
  Make unique id in vectorized way based on text data column with similarity scoring ill8 0 861 Dec-12-2022, 03:22 AM
Last Post: ill8
  replace sets of values in an array without using loops paul18fr 7 1,629 Jun-20-2022, 08:15 PM
Last Post: paul18fr
  Data sets comparison Fraetos 0 1,383 Sep-14-2021, 06:45 AM
Last Post: Fraetos
  update values in one dataframe based on another dataframe - Pandas iliasb 2 9,099 Aug-14-2021, 12:38 PM
Last Post: jefsummers
  Pandas Data frame column condition check based on length of the value aditi06 1 2,655 Jul-28-2021, 11:08 AM
Last Post: jefsummers
  Mann Whitney U-test on several data sets rybina 2 2,050 Jan-05-2021, 03:08 PM
Last Post: rybina
  Dropping Rows From A Data Frame Based On A Variable JoeDainton123 1 2,186 Aug-03-2020, 02:05 AM
Last Post: scidam
  Filter data based on a value from another dataframe column and create a file using lo pawanmtm 1 4,242 Jul-15-2020, 06:20 PM
Last Post: pawanmtm

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020