Python Forum
How to use pandas to compare two DataFrames having different structure




How to use pandas to compare two DataFrames having different structure - traja47 - Sep-24-2022

Hi all,
Please accept my apologies if this is not the correct forum for pandas.
I am a new Python pandas user and have a question.

I have two DataFrames as follows:
import pandas as pd

df1 = pd.read_excel('TR_TRAUS_contacts.xlsx', index_col=0, dtype={'Name': str, 'Market': str, 'Class': str,
'First Name': str, 'Last Name': str, 'Job Title': str, 'Email': str, 'Phone Number': str, 'Mobile Number': str, 'Street Address': str, 'City': str, 'State': str, 'Post Code': str})

df2 = pd.read_csv('Customers.csv', index_col=0, dtype={'Name': str, 'Status': str, 'Currency': str, 'PaymentTerm': str,
'TaxRule': str, 'AccountReceivable': str, 'SaleAccount': str, 'PriceTier': str, 'Discount': str, 'CreditLimit': str,
'Carrier': str, 'SalesRepresentative': str, 'Location': str, 'TaxNumber': str, 'Tags': str, 'AttributeSet': str,
'AdditionalAttribute1': str, 'AdditionalAttribute2': str, 'AdditionalAttribute3': str, 'AdditionalAttribute4': str,
'AdditionalAttribute5': str, 'AdditionalAttribute6': str, 'AdditionalAttribute7': str, 'AdditionalAttribute8': str,
'AdditionalAttribute9': str, 'AdditionalAttribute10': str, 'Comments': str, 'ContactName': str, 'JobTitle': str,
'Phone': str, 'MobilePhone': str, 'Fax': str, 'Email': str, 'Website': str, 'ContactComment': str,
'ContactDefault': str, 'ContactIncludeInEmail': str, 'IsAccountingDimensionEnabled': str})

df1 and df2 have different columns.
df12 (the output) should have the same columns as df2.

I need to generate df12 (saved as a .csv) containing only the rows that exist in df1 but do not exist in df2, with some of its column values taken from df1.
The index column 'Name' exists in both df1 and df2.
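
For illustration, the requirement above can be sketched without iteration as an anti-join on the shared 'Name' index. This is only a sketch under assumptions: the tiny frames stand in for the real files, and the column_map dictionary and the way ContactName is built from 'First Name' and 'Last Name' are hypothetical guesses at which df1 columns feed which df2 columns.

```python
import pandas as pd

# Small stand-in frames (made-up data); the real ones come from
# pd.read_excel / pd.read_csv with 'Name' as the index.
df1 = pd.DataFrame(
    {
        "First Name": ["Ann", "Bob", "Cy"],
        "Last Name": ["Lee", "Ray", "Fox"],
        "Job Title": ["CEO", "CTO", "CFO"],
        "Email": ["ann@x.example", "bob@x.example", "cy@x.example"],
    },
    index=pd.Index(["Acme", "Beta", "Core"], name="Name"),
)
df2 = pd.DataFrame(
    {
        "Status": ["Active"],
        "ContactName": ["Bob Ray"],
        "JobTitle": ["CTO"],
        "Email": ["bob@x.example"],
    },
    index=pd.Index(["Beta"], name="Name"),
)

# Hypothetical mapping from df1 column names to df2 column names.
column_map = {"Job Title": "JobTitle", "Email": "Email"}

# Anti-join: keep the df1 rows whose 'Name' index is absent from df2.
only_in_df1 = df1.loc[~df1.index.isin(df2.index)]

# Rename the mapped columns, build ContactName (an assumed rule), then
# reindex to df2's column layout; columns with no source become NaN.
full_name = only_in_df1["First Name"].str.cat(only_in_df1["Last Name"], sep=" ")
df12 = (
    only_in_df1.rename(columns=column_map)
    .assign(ContactName=full_name)
    .reindex(columns=df2.columns)
)

print(df12)
```

Writing the result out would then be a single df12.to_csv(...) call. The boolean mask from index.isin is evaluated for the whole index at once, so no explicit loop over rows is needed.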

Note:
I have achieved the above using JavaScript on Google Sheets, via iteration.
[Image: df1.jpg] [Image: df1df2df12.jpg]
I have been told not to use iteration in pandas.

Thanks in advance for any help.