Python Forum
Compare two large CSV files for a match
#1
Hello, I am very new to Python and am trying to solve the issue below.

We have two .csv files.

For example:
File Master: Column_A Column_B Column_C ..... Column_Z
123 XYZ Z 1X
234 PQR Y 2X

File New: Column_C Column_A Column_B
X 001 PQR
Y 123 XYZ
Y 234 PQR

Each file has similar data, but the columns and rows are not in the same order. When a row in the Master file matches a row in the New file, the Master file needs to be updated by adding a new column populated with Match or No Match. It also needs a weight: 1 if ALL columns and values match, 0.5 for a partial match, and 0 otherwise.

These files are large, running to several GB each.
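
Given the file sizes mentioned above, one option is to stream the Master file row by row with the standard csv module and write an annotated copy, rather than loading everything into memory. The sketch below is only an illustration under assumptions: it assumes comma-delimited files with a header row, the function name annotate_rows and the output column names Match and Weight are made up for this example, and the score callable is a placeholder for whatever matching rule is finally agreed on.

```python
import csv
import io


def annotate_rows(src, dst, score):
    """Copy CSV rows from src to dst, appending Match and Weight columns.

    src and dst are open text file objects; score is a caller-supplied
    function taking a row dict and returning (label, weight).
    Only one row is held in memory at a time.
    """
    reader = csv.DictReader(src)
    fieldnames = list(reader.fieldnames) + ["Match", "Weight"]
    writer = csv.DictWriter(dst, fieldnames=fieldnames)
    writer.writeheader()
    for row in reader:
        label, weight = score(row)
        row["Match"] = label
        row["Weight"] = weight
        writer.writerow(row)


# Small in-memory demonstration; real use would pass files opened with
# open(path, newline="") instead of StringIO objects.
def placeholder_score(row):
    # Hypothetical rule, for demonstration only.
    return ("Match", 1.0) if row["Column_A"] == "123" else ("No Match", 0.0)


master = io.StringIO("Column_A,Column_B\n123,XYZ\n234,PQR\n")
out = io.StringIO()
annotate_rows(master, out, placeholder_score)
result_lines = out.getvalue().splitlines()
```

Writing to a new file instead of editing Master in place keeps the original intact if the run fails partway through.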

Please help!
#2
Some clarification is needed here. What counts as a partial match? Matching one column? Matching two columns? You say that when there's a match, the master needs to be updated with match or no match. Why would you ever update with no match if there is a match? Or do you want to add a column to the master file for every row in the new file with the degree of matchiness?

Also, what have you tried? We're not big on writing code for people here, but we would be happy to help you fix your code when you run into problems. When you do run into problems, please post your code in Python tags, and clearly explain the problem you are having, including the full text of any errors.
Craig "Ichabod" O'Brien - xenomind.com
#3
(Apr-22-2019, 05:40 PM)ichabod801 Wrote: Some clarification is needed here. What counts as a partial match? Matching one column? Matching two columns? You say that when there's a match, the master needs to be updated with match or no match. Why would you ever update with no match if there is a match? Or do you want to add a column to the master file for every row in the new file with the degree of matchiness?

Also, what have you tried? We're not big on writing code for people here, but we would be happy to help you fix your code when you run into problems. When you do run into problems, please post your code in Python tags, and clearly explain the problem you are having, including the full text of any errors.

Thank you for your reply.

A match on at least one column value would be considered a partial match; if ALL columns match, it is a full match.
Ideally there would be two new columns (Match and Weight) in the master file. When there is a match, the row will display Match and its weight. I hope that makes sense.
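
The rule described above could be sketched roughly like this. It rests on several assumptions that are not settled in this thread: only the columns present in both files are compared (here Column_A, Column_B and Column_C), values are compared as exact strings, a partial match is labelled Match with weight 0.5, and a partial match counts if any value appears anywhere in the same column of the New file. The helper names build_index and score_row are hypothetical. The index of the New file lives in memory, so this only works as written if that index fits in RAM.

```python
def build_index(new_rows, columns):
    """Index the New file: a set of full rows plus a value set per column."""
    full_rows = set()
    values_by_column = {col: set() for col in columns}
    for row in new_rows:
        # Look values up by column name, so column order does not matter.
        full_rows.add(tuple(row[col] for col in columns))
        for col in columns:
            values_by_column[col].add(row[col])
    return full_rows, values_by_column


def score_row(master_row, columns, full_rows, values_by_column):
    """Return (label, weight) for one Master row."""
    key = tuple(master_row[col] for col in columns)
    if key in full_rows:
        return "Match", 1.0       # every shared column matches one New row
    if any(master_row[col] in values_by_column[col] for col in columns):
        return "Match", 0.5       # at least one column value matches
    return "No Match", 0.0


# Sample data taken from the first post.
SHARED = ["Column_A", "Column_B", "Column_C"]
new_rows = [
    {"Column_C": "X", "Column_A": "001", "Column_B": "PQR"},
    {"Column_C": "Y", "Column_A": "123", "Column_B": "XYZ"},
    {"Column_C": "Y", "Column_A": "234", "Column_B": "PQR"},
]
full_rows, values_by_column = build_index(new_rows, SHARED)

row_1 = {"Column_A": "123", "Column_B": "XYZ", "Column_C": "Z"}
row_2 = {"Column_A": "234", "Column_B": "PQR", "Column_C": "Y"}
print(score_row(row_1, SHARED, full_rows, values_by_column))  # ('Match', 0.5)
print(score_row(row_2, SHARED, full_rows, values_by_column))  # ('Match', 1.0)
```

Using sets makes each lookup roughly constant time, so the Master file only needs a single pass regardless of row order in either file.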

Here is a link I found with code that I am trying to modify for my needs. It is giving me all sorts of errors.

https://python-forum.io/Thread-How-to-co...+two+files
#4
Then post the code with the modifications you made, and the full text of any errors you got, as I described in my previous post.

