Python Forum

Pulling and comparing CSV data
I need a script that takes two inputs from the user: the full path (including filename) to a reference CSV file, and the path to a directory containing other files. It should find the CSV files within that directory, compare their data against the reference file, and print the filepath\filename of each file that contains a match (or print "no matches found" if there are none). The script runs but no data appears, and I know there are matches. I'm close, but I can't put it all together.

##Import all of the required modules
import os, re, sys, csv, glob

##Ask user to specify the directory that contains various CSV files
searchPath = raw_input('Specify directory containing CSV Files: ')

##Ask user to specify the location of the csv file reference information
referencePath = raw_input('Specify the full path to the Reference CSV file. This file should only contain the data you are searching for: ')

##Create an empty list to hold values from Excel CSV used for search.
##These are the CSV values we do not have info for.
xUnmatchedValues = []

##Open and read the CSV file line by line
with open(referencePath) as exceldata:

   for e in exceldata:
       ##As you read each line, strip out any extra white spaces or new lines        
       e = e.strip()
       ##Append each cleaned up line (in this case each line contains a single value) into the list declared above   
       xUnmatchedValues.append(e)

##xUnmatchedValues contains the list of reference data from reference csv

##Walk through directory to find csv files and read their data
for root, dirs, files in os.walk(searchPath):
   for file in files:
      if file.endswith(".csv"):
          f=open(file, 'r')
          for m in exceldata:
              m = m.strip()
              xPossibleMatches.append(m)
              file_name_path = os.path.join(root, file)
          ##Check to see if data from xUmatchedValues is in data from xPossibleMatches
          if any(i in xUnmatchedValues for i in xPossibleMatches):
              Print ("Match found! File Name:",file)
              Print ("Match found! File Path:",file_name_path)

          if not any(i in xUnmatchedValues for i in xPossibleMatches):
              Print ("No Match")
          f.close()

##Indicate to the user that the directory search has finished.
print "Directory Search Complete"
Debugging 101: Add print statements at strategic places to check the values of your variables... And where are you defining Print (with uppercase "p")?
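As a rough illustration of that advice, the sketch below prints what os.walk actually yields before any comparison is attempted. It is written for Python 3 (the script above uses Python 2's raw_input and print statement), and debug_walk is a hypothetical helper name, not something from the original script.

```python
import os


def debug_walk(search_path):
    """Print each CSV file that os.walk finds, and return the joined paths."""
    seen = []
    for root, dirs, files in os.walk(search_path):
        for name in files:
            if name.endswith(".csv"):
                full_path = os.path.join(root, name)
                # repr() makes stray whitespace or odd characters visible.
                print("root=%r name=%r" % (root, name))
                # Compare the bare name with the joined path: only the joined
                # path is guaranteed to point at the file os.walk just found.
                print("  bare name exists:   %r" % os.path.exists(name))
                print("  joined path exists: %r" % os.path.exists(full_path))
                seen.append(full_path)
    return seen
```

Calling debug_walk(searchPath) right after the prompts shows whether the files are being found at all, which narrows down where the data goes missing.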
Corrected the uppercase P in Print. Now getting this error:
Traceback (most recent call last):
  File "C:\Users\Generic\Downloads\PythonProjects\Matchv2.1.py", line 27, in <module>
    f=open(file, 'r')
IOError: [Errno 2] No such file or directory: 'MatchRoot.csv'
You need to pass the full path to open(), not just the file name: use os.path.join(root, file).
It's also better to open the file with a context manager (the with statement), which closes the file for you as soon as the block ends.
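Putting those two suggestions together, here is a minimal sketch of how the search could look. It assumes Python 3 (input and print() instead of raw_input and the print statement) and keeps the original approach of comparing whole stripped lines. Beyond the path problem, the posted loop also reads from exceldata (the already-closed reference file) instead of f, and appends to xPossibleMatches without ever defining it; the sketch avoids both. The function names load_reference_values, find_matching_files and main are illustrative, not from the original script.

```python
import os


def load_reference_values(reference_path):
    """Return the set of non-empty, stripped lines from the reference file."""
    with open(reference_path) as ref_file:
        return {line.strip() for line in ref_file if line.strip()}


def find_matching_files(search_path, reference_values):
    """Return full paths of .csv files under search_path that contain
    at least one line found in reference_values."""
    matches = []
    for root, dirs, files in os.walk(search_path):
        for name in files:
            if not name.endswith(".csv"):
                continue
            # Build the full path; the bare name only works if the file
            # happens to be in the current working directory.
            full_path = os.path.join(root, name)
            # The with statement closes the file when the block ends.
            with open(full_path) as candidate:
                if any(line.strip() in reference_values for line in candidate):
                    matches.append(full_path)
    return matches


def main():
    search_path = input("Specify directory containing CSV files: ")
    reference_path = input("Specify the full path to the reference CSV file: ")
    matches = find_matching_files(search_path, load_reference_values(reference_path))
    if matches:
        for path in matches:
            print("Match found! File path:", path)
    else:
        print("No matches found")
    print("Directory search complete")
```

Call main() at the bottom of the script to run it interactively. Note that if the reference file itself sits inside the search directory and ends in .csv, it will be reported as a match, so you may want to skip it.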