Python Forum
Compare two Excel sheets with Python and list diffenrences
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Compare two Excel sheets with Python and list diffenrences
#1
Hello,
I have issue with my program. The aim of this program is to compare two Excel sheets and then list (display/print) differences in another sheet called resultats.
I have an error with the sort values.
These data frames are divided in 7 seven columns : Radical, Name, Parent, Trigram, Conuntry, GPC App Settings, Type.
You can add whatever data you want
Please how can I improve the code ?

Here's the code down below :
 
import pandas as pd
import numpy as np

# Define the diff function to show the changes in each field
def report_diff(x):
    return x[0] if x[0] == x[1] else '{} ---> {}'.format(*x)

# We want to be able to easily tell which rows have changes
def has_change(row):
    if "--->" in row.to_string():
        return "Y"
    else:
        return "N"

# Read in both excel files
df1 = pd.read_excel('Desktop/Modele_Tiers_Exemple.xlsx', 'tiersM-1', na_values=['NA'])
df2 = pd.read_excel('Desktop/Modele_Tiers_Exemple.xlsx', 'tiersM', na_values=['NA'])

# Make sure we order by account number so the comparisons work
df1.sort_values(by=['Radical'])
df1=df1.reindex()
df2.sort_values(by=['Radical'])
df2=df2.reindex()

# Create a panel of the two dataframes
diff_panel = pd.Panel(dict(df1=df1,df2=df2))

#Apply the diff function
diff_output = diff_panel.apply(report_diff, axis=0)

# Flag all the changes
diff_output['has_change'] = diff_output.apply(has_change, axis=1)

#Save the changes to excel but only include the columns we care about
diff_output[(diff_output.has_change == 'Y')].to_excel('Desktop/Resultats.xlsx',index=False,columns=["Radical",
                                                      " Name","Parent","Trigram","Country","GPC App Settings", "Type"]) 

Attached Files

.xlsx   Modele_Tiers_Exemple.xlsx (Size: 10.09 KB / Downloads: 258)
Reply
#2
you can use pandas, see: https://stackoverflow.com/a/37113788

The code provided on post above:
import pandas as pd

df1 = pd.read_excel('excel1.xlsx')
df2 = pd.read_excel('excel2.xlsx')

difference = df1[df1!=df2]
print difference
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  list from excel devilonline 4 287 Jun-22-2022, 11:00 PM
Last Post: devilonline
Question How to make a 3D List of Excel Spreadsheets? chatguy 4 1,551 Jan-24-2021, 05:24 AM
Last Post: buran
  Compare response and name list in experiment knoxvillerailgrind 3 1,398 Jul-26-2020, 12:23 PM
Last Post: deanhystad
  identical cells in 2 different excel sheets python pandas esso 0 1,010 Jul-19-2020, 07:50 PM
Last Post: esso
  Convert Excel to complex list and2handles 1 1,283 Jun-23-2020, 01:51 PM
Last Post: DPaul
  Trying to color an excel row based on list curranjohn46 2 4,716 May-19-2020, 10:35 AM
Last Post: KavyaL
  Compare Two Lists and Replace Items In a List by Index nagymusic 2 1,977 May-10-2020, 05:28 AM
Last Post: deanhystad
  Compare 5 variables in a record with an excel sheet!! SEED 1 1,123 Apr-20-2020, 11:10 PM
Last Post: michael1789
  How to list out specific excel files ajay_pal7 2 1,726 Mar-10-2020, 05:43 AM
Last Post: Larz60+
  how to compare two different size images in python and find corresponding pixel value squidsirymchenry 1 2,907 Feb-03-2020, 06:48 AM
Last Post: michael1789

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020