Compare two Excel sheets with Python and list diffenrences - Printable Version +- Python Forum (https://python-forum.io) +-- Forum: Python Coding (https://python-forum.io/forum-7.html) +--- Forum: General Coding Help (https://python-forum.io/forum-8.html) +--- Thread: Compare two Excel sheets with Python and list diffenrences (/thread-35209.html) |
Compare two Excel sheets with Python and list diffenrences - dmkfon - Oct-09-2021 Hello, I have issue with my program. The aim of this program is to compare two Excel sheets and then list (display/print) differences in another sheet called resultats. I have an error with the sort values. These data frames are divided in 7 seven columns : Radical, Name, Parent, Trigram, Conuntry, GPC App Settings, Type. You can add whatever data you want Please how can I improve the code ? Here's the code down below : import pandas as pd import numpy as np # Define the diff function to show the changes in each field def report_diff(x): return x[0] if x[0] == x[1] else '{} ---> {}'.format(*x) # We want to be able to easily tell which rows have changes def has_change(row): if "--->" in row.to_string(): return "Y" else: return "N" # Read in both excel files df1 = pd.read_excel('Desktop/Modele_Tiers_Exemple.xlsx', 'tiersM-1', na_values=['NA']) df2 = pd.read_excel('Desktop/Modele_Tiers_Exemple.xlsx', 'tiersM', na_values=['NA']) # Make sure we order by account number so the comparisons work df1.sort_values(by=['Radical']) df1=df1.reindex() df2.sort_values(by=['Radical']) df2=df2.reindex() # Create a panel of the two dataframes diff_panel = pd.Panel(dict(df1=df1,df2=df2)) #Apply the diff function diff_output = diff_panel.apply(report_diff, axis=0) # Flag all the changes diff_output['has_change'] = diff_output.apply(has_change, axis=1) #Save the changes to excel but only include the columns we care about diff_output[(diff_output.has_change == 'Y')].to_excel('Desktop/Resultats.xlsx',index=False,columns=["Radical", " Name","Parent","Trigram","Country","GPC App Settings", "Type"]) RE: Compare two Excel sheets with Python and list diffenrences - Larz60+ - Oct-09-2021 you can use pandas, see: https://stackoverflow.com/a/37113788 The code provided on post above: import pandas as pd df1 = pd.read_excel('excel1.xlsx') df2 = pd.read_excel('excel2.xlsx') difference = df1[df1!=df2] print difference |