Bottom Page

Thread Rating:
  • 1 Vote(s) - 4 Average
  • 1
  • 2
  • 3
  • 4
  • 5
 Match two data sets based on item values
#1
I have two data frames, which i want to combine usinga left join or similar.
PV number wise subtotal of the amount column of the dataframe (Pmt) has to be match with the second table (BankWithdrawal).
If such sub total matches then join that row of second table with the relevant first table rows. Resultant table with the other two tables attached.

Get the sub totals in PVNo wise from Pmt dataframe
subtotal=pmt[["PVNo.","Amount"]].groupby('PVNo.').sum()
Great if someone can help me to achive the resultant table.


Attached Files
.xlsx   Pmt.xlsx (Size: 8.5 KB / Downloads: 84)
.xlsx   BankWithdrawal.xlsx (Size: 8.19 KB / Downloads: 102)
.xlsx   ResultCombined.xlsx (Size: 8.66 KB / Downloads: 78)
Quote
#2
what are you summing? there are no numbers here
show a run-able snippet of code
Quote
#3
(Mar-28-2017, 07:35 PM)Larz60+ Wrote: what are you summing? there are no numbers here
show a run-able snippet of code

Thanks for your interest.

I'm not sure how to add data frames to this post. So i have attached 'pmt' dataframe & 'withdrawal' dataframe.

I'm taking the subtotal of Amount column for each value in 'PVno' column.

Now i have to match the amounts resulted in above with the 'B.amount' column in 'withdrawal' dataframe & make a combined dataframe.

Expected dataframe (Resultcombined) is also attached.
Quote
#4
You can do SQL-like joins on pandas dataframes with pandas.merge()
merged_dataframe = pd.merge(subtotal_dataframe, bankwithdrawal_dataframe)
You can specify what type of join with parameter how=, default is 'inner'; on default it uses common columns as keys for join, but you can specify it with on= parameter (or with left_on and right_on= if your keys are different).
klllmmm likes this post
Quote
#5
(Mar-28-2017, 07:55 PM)zivoni Wrote: You can do SQL-like joins on pandas dataframes with pandas.merge()
merged_dataframe = pd.merge(subtotal_dataframe, bankwithdrawal_dataframe)
You can specify what type of join with parameter how=, default is 'inner'; on default it uses common columns as keys for join, but you can specify it with on= parameter (or with left_on and right_on= if your keys are different).

Thanks for the reply. It works.
It seems common column('Amount') has to have same column name in both tables.
Quote
#6
You can use left_on='column_in_left_frame', right_on='column_in_right_frame' as a parameters for pd.merge.
EDIT: i guess that i misread your reply and it worked (same column)
Quote
#7
(Mar-28-2017, 08:19 PM)zivoni Wrote: You can use left_on='column_in_left_frame', right_on='column_in_right_frame' as a parameters for pd.merge.

Thanks, That's great.

Is there a way to add multiple common colums to match.
Let say i have another common column like 'Date' in both tables, where i have to match only if the 'date' in second tables is on or after the 'date' in table one.
Quote
#8
Perhaps just join it (without using Date as a key) and after that you can subset resulting dataframe (common columns not used as a key should have _x and _y appened).
df[df.Date_x <= df.Date_y]
klllmmm likes this post
Quote

Top Page

Possibly Related Threads...
Thread Author Replies Views Last Post
  Sample based on the distribution of a feature to create more balanced data set dervast 0 68 Sep-25-2019, 05:56 PM
Last Post: dervast
  Clustering for imbalanced data sets dervast 0 70 Sep-25-2019, 06:34 AM
Last Post: dervast
  how to get x values based on y-axis values from curvefit function python_newbie09 1 315 Sep-19-2019, 02:09 AM
Last Post: scidam
  select data based on indice Staph 4 223 Jul-15-2019, 02:05 AM
Last Post: scidam
  Grouping data based on rolling conditions kapilan15 0 217 Jun-05-2019, 01:07 PM
Last Post: kapilan15
  Match a table with sub-total values to its detail value table klllmmm 2 1,289 Apr-03-2019, 11:28 AM
Last Post: klllmmm
  Time Data does not match format AshBax 2 6,478 Nov-13-2018, 12:19 PM
Last Post: AshBax
  Splitting values in column in a pandas dataframe based on a condition hey_arnold 1 1,402 Jul-24-2018, 02:18 PM
Last Post: hey_arnold
  Get max values based on unique values in another list - python Antonio 8 1,765 Jun-12-2018, 07:49 PM
Last Post: Mekire
  Compare 2 Csv data sets, identify record with latest date MJUk 11 1,807 Jan-06-2018, 09:23 PM
Last Post: MJUk

Forum Jump:


Users browsing this thread: 1 Guest(s)