Python Forum
best option for comparing two csv files


best option for comparing two csv files - zuzuzu - Apr-15-2020

Hi All,

I have the code below, which compares two CSV files and appends any lines from new_data.csv that are not already in file_with_all_data.csv to file_with_all_data.csv.
The aim is to append new data to a historic file that holds all the data.

I want to change the code so that it compares only the first column of both files and, where there is a difference, writes the entire row to the difference file. Is this the best way to do this? Or would you recommend using the diff function?

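import csv  # imported but not used below


# Read both files as lists of raw lines.
with open('file_with_all_data.csv', 'r') as t1, open('new_data.csv', 'r') as t2:
    fileone = t1.readlines()
    filetwo = t2.readlines()

matches = []  # lines from new_data.csv that are missing from the historic file

# Write every line that only appears in new_data.csv to additions.csv.
with open('additions.csv', 'w') as outFile:
    for line in filetwo:
        if line not in fileone:
            matches.append(line)
            outFile.write(line)

# Rewrite the historic file with the new lines appended at the end.
with open('file_with_all_data.csv', 'w') as outFile:
    outFile.write(''.join(fileone).strip() + '\n' + ''.join(matches))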
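
For the first-column comparison asked about above, one possible approach is to parse both files with the csv module and keep a set of the first-column values already seen. This is only a minimal sketch: the function name append_new_rows and its parameters are made up for illustration, and it assumes the first column acts as a unique key and that both files share the same header row (so the header of the new file is skipped automatically).

```python
import csv


def append_new_rows(all_path, new_path, additions_path):
    """Append rows from new_path whose first column is not yet in all_path.

    Hypothetical helper: assumes column 0 is a unique key in both files.
    """
    # Load the historic rows and remember every first-column value.
    with open(all_path, newline='') as f:
        all_rows = [row for row in csv.reader(f) if row]
    seen = {row[0] for row in all_rows}

    # Keep whole rows from the new file whose key has not been seen yet.
    additions = []
    with open(new_path, newline='') as f:
        for row in csv.reader(f):
            if row and row[0] not in seen:
                additions.append(row)
                seen.add(row[0])  # also skips repeated keys inside new_path

    # Write the additions to their own file.
    with open(additions_path, 'w', newline='') as f:
        csv.writer(f, lineterminator='\n').writerows(additions)

    # Rewrite the historic file: existing rows first, then the additions.
    with open(all_path, 'w', newline='') as f:
        csv.writer(f, lineterminator='\n').writerows(all_rows + additions)

    return additions
```

It could be called as append_new_rows('file_with_all_data.csv', 'new_data.csv', 'additions.csv'). Using a set makes each lookup cheap compared with searching a list of lines, and rewriting the historic file (as the original code does) avoids problems when its last line has no trailing newline.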



RE: best option for comparing two csv files - Larz60+ - Apr-15-2020

On Linux, use diff from the command line.
If you need to do it programmatically, look through: https://pypi.org/search/?q=%27file+diff%27
This lists all file diff packages in 'last updated' order.
You will have to examine them to see which packages are applicable.
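
As an alternative to a third-party package, Python's standard library includes difflib, which can produce diff-style output from within a script. The sketch below is not one of the packages from the search above; the function name diff_files is made up for illustration, and it compares whole lines rather than individual columns.

```python
import difflib


def diff_files(old_path, new_path):
    """Return a unified diff of two text files as a list of lines."""
    with open(old_path) as f:
        old_lines = f.readlines()
    with open(new_path) as f:
        new_lines = f.readlines()
    # unified_diff yields header lines, then context, '-' and '+' lines.
    return list(difflib.unified_diff(old_lines, new_lines,
                                     fromfile=old_path, tofile=new_path))
```

For example, print(''.join(diff_files('file_with_all_data.csv', 'new_data.csv'))) would print the differences in the same style as diff -u. Lines starting with '+' exist only in the second file.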