Bottom Page

Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
 Compare between 2 DataFrames
#1
Hi All,

I am a newbie in python and got stuck. Actually i want to select the data after comparing 2 data frames i.e. want to select all the records for matched Consumer_No from df2 for all the records whose date is lesser or equal to df2['Date']


df1 = {Consumer_No :[1011,1012,1013,1014,1011,1012,1013,1014,1011,1012,1013,1014,1011,1012,1013,1014],
       Date:['31/01/2019','31/01/2019','31/01/2019','31/01/2019','28/02/2019','28/02/2019','28/02/2019','28/02/2019','31/03/2019','31/03/2019','31/03/2019','31/03/2019','30/04/2019','30/04/2019','30/04/2019','30/04/2019']
      }

df2 = {Consumer_No : [1013,1011],Date:['24/03/2019','29/04/2019']}

if df1['Consumer_No']== df2['Consumer_No']:
     if df1['Date'] >= df2['Date']:
        matched = pd.merged(df1,df2, on='Consumer_No')
Please help me out in the same.
Quote
#2
Something wrong with this code. df1 and df2 are not Pandas data frames,
they are Python dictionaries.
I don't understand a second condition, are you requiring that df1.Date date is greater (later)
than at least one date in df2.Date? These dataframes have non-equal sizes, so we cannot perform
elementwise comparison.

Nevertheless, try the following, this is probably is what you are looking for:

import pandas as pd
df1 = pd.DataFrame({'Consumer_No' :[1011,1012,1013,1014,1011,1012,1013,1014,1011,1012,1013,1014,1011,1012,1013,1014],
       'Date':['31/01/2019','31/01/2019','31/01/2019','31/01/2019','28/02/2019','28/02/2019','28/02/2019','28/02/2019','31/03/2019','31/03/2019','31/03/2019','31/03/2019','30/04/2019','30/04/2019','30/04/2019','30/04/2019']
      })
 
df2 = pd.DataFrame({'Consumer_No' : [1013,1011],'Date':['24/03/2019','29/04/2019']})

df1.Date =pd.to_datetime(df1.Date)
df2.Date =pd.to_datetime(df2.Date)

first_condition = df1.Consumer_No.isin(df2.Consumer_No)
second_condition = df1.Date > min(df2.Date)
result = df1.loc[first_condition & second_condition]
Quote
#3
Thank you very much scidam for your reply. Actually i am reading 2 csv files which results 2 different dataframes df1 & df2. Well, df1 consists monthly data while df2 consists the information about the check performed on particular user.So, I used second condition because i want to extract all the prior records of a matched consumers from the date mentioned in df2.

(Jul-25-2019, 11:51 PM)scidam Wrote: Something wrong with this code. df1 and df2 are not Pandas data frames,
they are Python dictionaries.
I don't understand a second condition, are you requiring that df1.Date date is greater (later)
than at least one date in df2.Date? These dataframes have non-equal sizes, so we cannot perform
elementwise comparison.

Nevertheless, try the following, this is probably is what you are looking for:

import pandas as pd
df1 = pd.DataFrame({'Consumer_No' :[1011,1012,1013,1014,1011,1012,1013,1014,1011,1012,1013,1014,1011,1012,1013,1014],
       'Date':['31/01/2019','31/01/2019','31/01/2019','31/01/2019','28/02/2019','28/02/2019','28/02/2019','28/02/2019','31/03/2019','31/03/2019','31/03/2019','31/03/2019','30/04/2019','30/04/2019','30/04/2019','30/04/2019']
      })
 
df2 = pd.DataFrame({'Consumer_No' : [1013,1011],'Date':['24/03/2019','29/04/2019']})

df1.Date =pd.to_datetime(df1.Date)
df2.Date =pd.to_datetime(df2.Date)

first_condition = df1.Consumer_No.isin(df2.Consumer_No)
second_condition = df1.Date > min(df2.Date)
result = df1.loc[first_condition & second_condition]
Quote

Top Page

Possibly Related Threads...
Thread Author Replies Views Last Post
  Creating A List of DataFrames & Manipulating Columns in Each DataFrame firebird 1 204 Jul-31-2019, 04:04 AM
Last Post: scidam
  Giving index when joining dataframes kw42chan 1 356 Jul-06-2019, 06:19 AM
Last Post: kw42chan
  Could anyone help me get the jaccard distance between my dataframes please? :) a_real_phoenix 0 421 Jun-27-2019, 06:01 PM
Last Post: a_real_phoenix
  Two dataframes merged Ecniv 10 749 Jun-16-2019, 09:10 PM
Last Post: Ecniv
  Statistical analysis of two dataframes zhl 1 549 Jun-11-2019, 07:26 PM
Last Post: Ecniv
  Interpolate using multiple dataframes Lastwizzle 0 299 May-29-2019, 05:32 PM
Last Post: Lastwizzle
  Sum product multiple Dataframes based on column headers. Lastwizzle 0 579 May-21-2019, 04:05 PM
Last Post: Lastwizzle
  Why can't I merge pandas dataframes learnpython2018 2 1,061 Sep-23-2018, 05:53 PM
Last Post: learnpython2018

Forum Jump:


Users browsing this thread: 1 Guest(s)