Python Forum
How to compare two files and Display different results for text and for INT
#1
Hi,

I am trying to write a program that compares two CSV files and writes the results to a new CSV file.
The cells in the CSV files contain both text and integer values. When a cell has changed and its value is text, the program should write True against that value in the new CSV file. When a cell has changed and its value is an integer, it should write either "Result is positive: change in value" or "Result is negative: change in value".

Below is my code:

import csv
with open('book1.csv', 'r') as t1:
    old_csv = t1.readlines()
with open('book2.csv', 'r') as t2:
    new_csv = t2.readlines()

with open('update.csv', 'w') as out_file:
    line_in_new = 0
    line_in_old = 0
    while line_in_new < len(new_csv) and line_in_old < len(old_csv):
        if old_csv[line_in_old] != new_csv[line_in_new]:
            out_file.write(new_csv[line_in_new])
        else:
            line_in_old += 1
        line_in_new += 1
I am also attaching the sample files for your convenience. Please guide me.

Attached Files

.csv   Book1.csv (Size: 1.29 KB / Downloads: 43)
.csv   Book2.csv (Size: 1.37 KB / Downloads: 48)
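
The behaviour described in post #1 might be sketched with only the standard library as follows. This is a rough illustration, not a tested solution for the attached files. It assumes both files share a header row and a unique key column, and that "positive" means the new number is larger than the old one. The helper names (`as_int`, `describe_change`, `compare_rows`, `write_report`) and the output layout are hypothetical.

```python
import csv

POSITIVE = 'Result is positive: change in value'
NEGATIVE = 'Result is negative: change in value'


def as_int(value):
    """Return value as an int, or None when it is not a whole number."""
    try:
        return int(value.strip())
    except ValueError:
        return None


def describe_change(old, new):
    """Return '' for no change, 'True' for changed text, or a sign message."""
    old_int, new_int = as_int(old), as_int(new)
    if old_int is not None and new_int is not None:
        if new_int == old_int:
            return ''
        # Assumption: "positive" means the new number is larger than the old.
        return POSITIVE if new_int > old_int else NEGATIVE
    return '' if old == new else 'True'


def compare_rows(old_rows, new_rows, key):
    """Yield [key value, column, old, new, result] for every changed cell."""
    old_by_key = {row[key]: row for row in old_rows}
    for new_row in new_rows:
        old_row = old_by_key.get(new_row[key])
        if old_row is None:
            continue  # row exists only in the new file; ignored in this sketch
        for column, new_value in new_row.items():
            if column is None:
                continue  # surplus cells that have no header are skipped
            old_value = old_row.get(column) or ''
            new_value = new_value or ''
            result = describe_change(old_value, new_value)
            if result:
                yield [new_row[key], column, old_value, new_value, result]


def write_report(old_path, new_path, out_path, key):
    """Compare two CSV files on disk and write the changed cells to out_path."""
    with open(old_path, newline='') as old_file, open(new_path, newline='') as new_file:
        old_rows = list(csv.DictReader(old_file))
        new_rows = list(csv.DictReader(new_file))
    with open(out_path, 'w', newline='') as out_file:
        writer = csv.writer(out_file)
        writer.writerow([key, 'column', 'old', 'new', 'result'])
        writer.writerows(compare_rows(old_rows, new_rows, key))
```

If the files really do contain a unique `XID` column (the name is taken from post #2), a call such as `write_report('Book1.csv', 'Book2.csv', 'update.csv', key='XID')` would write one line per changed cell. Values such as `1,200` or `1.5` are not whole numbers for `int()`, so this sketch treats them as text.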
#2
Hi, I also tried it this way, but received an error:

import pandas as pd

file1 = 'Book1.csv'
file2 = 'Book2.csv'
file3 = 'update.csv'

cols_to_show = ['XID', 'TCO', 'Payment Plan','Livable Area','Brochure', 'Banks']

old = pd.read_csv(file1)
new = pd.read_csv(file2)


def report_diff(x):
    return x[0] if x[1] == x[0] else '{0} --> {1}'.format(*x)


old['version'] = 'old'
new['version'] = 'new'

full_set = pd.concat([old, new], ignore_index=True)

changes = full_set.drop_duplicates(subset=cols_to_show, keep='last')

dupe_names = changes.set_index('XID').index.get_duplicates()

dupes = changes[changes['XID'].isin(dupe_names)]

change_new = dupes[(dupes['version'] == 'new')]
change_old = dupes[(dupes['version'] == 'old')]

change_new = change_new.drop(['version'], axis=1)
change_old = change_old.drop(['version'], axis=1)

change_new.set_index('XID', inplace=True)
change_old.set_index('XID', inplace=True)

diff_panel = pd.Panel(dict(df1=change_old, df2=change_new))
diff_output = diff_panel.apply(report_diff, axis=0)

changes['duplicate'] = changes['XID'].isin(dupe_names)
removed_names = changes[(changes['duplicate'] == False) & (changes['version'] == 'old')]
removed_names.set_index('XID', inplace=True)
new_name_set = full_set.drop_duplicates(subset=cols_to_show)

new_name_set['duplicate'] = new_name_set['XID'].isin(dupe_names)

added_names = new_name_set[(new_name_set['duplicate'] == False) & (new_name_set['version'] == 'new')]
added_names.set_index('XID', inplace=True)
print(added_names)
df = pd.concat([diff_output, removed_names, added_names], keys=('changed', 'removed', 'added'))
print(df)
df[cols_to_show].to_csv(file3)
and the error is:


Error:
KeyError: "['XID'] not in index"
Can anyone help me with this error?
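
One likely cause, offered as a guess without the full traceback: after `set_index('XID')`, `XID` is the index rather than a column, so selecting a column list that still contains `'XID'` (as the final `df[cols_to_show]` does) raises a `KeyError`. The sketch below reproduces that on made-up data. Only the column names come from the post, and the `reset_index()` step is one possible fix, not necessarily the right one for this script.

```python
import pandas as pd

# Made-up sample rows; only the column names come from the post.
frame = pd.DataFrame({'XID': [1, 2], 'Banks': ['A', 'B']})
cols_to_show = ['XID', 'Banks']

# After set_index, 'XID' lives in the index and is no longer a column.
indexed = frame.set_index('XID')

error_message = None
try:
    indexed[cols_to_show]  # 'XID' is not among the columns any more
except KeyError as exc:
    error_message = str(exc)

# Possible fix: move the index back into a column before selecting.
restored = indexed.reset_index()[cols_to_show]
```

Separately, `pd.Panel` and `Index.get_duplicates()` have been removed from current pandas releases, so the script as posted only runs on an old pandas version.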