Python Forum
Compare filename with folder name and copy matching files into a particular folder
#1
I am trying to write a Python script that uses a regex to compare the names of the files in the input folder with the names of the folders in the output location. If a name matches, the file should be copied from the input folder into the matching output folder.

Remember, we are comparing the name of a file in the input folder with the names of the folders in the output location.

For Example:

Input folder: The filename looks like this "A10620.csv"

Output folder 1: There is a folder named "A10620_RD19CAR2000" alongside other folders.

In output folder 1, I need to search for a folder whose name matches the filename (first 6 characters only). If they match, the file is copied into that folder.

I need to search for that filename in two locations (Output folder 1 and Output folder 2). If the folder is not found in location 2 either, the file is dumped into the "Files not found" folder at location 3.
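
The matching rule described above (first six characters of the filename against the start of the folder name) could be sketched roughly like this. The names `KEY_LENGTH`, `file_key` and `folder_matches` are made up for illustration and are not part of the script below; the comparison ignores case, on the assumption that this is wanted because the script's regex uses `re.IGNORECASE`.

```python
from pathlib import Path

KEY_LENGTH = 6  # hypothetical constant: "first 6 characters only"


def file_key(filename):
    """Return the first six characters of the filename, without its extension."""
    return Path(filename).stem[:KEY_LENGTH]


def folder_matches(folder_name, filename):
    """Return True if the folder name starts with the file's key (case-insensitive)."""
    return folder_name.upper().startswith(file_key(filename).upper())


print(file_key("A10620.csv"))                               # A10620
print(folder_matches("A10620_RD19CAR2000", "A10620.csv"))   # True
print(folder_matches("A10621_RD19CAR2000", "A10620.csv"))   # False
```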

Please see attached picture to get an idea of how the folder structure looks.

Here is my Python script:
import os
import re
import shutil

# Traverse each file
sourcedir = "C:\\Users\\ShantanuGupta\\Desktop\\OzSc1\\InputFolder"
# Search for folder in Location 1
destinationdir1 = r"C:\Users\ShantanuGupta\Desktop\OzSc1\OutputFolder1"
# Search for folder in Location 2
destinationdir2 = "C:\\Users\\ShantanuGupta\\Desktop\\OzSc1\\OutputFolder2"
# Put files in folder "Folder Not Found"
destinationdir3 = "C:\\Users\\ShantanuGupta\\Desktop\\OzSc1\\OutputFolder3\\FoldersThatCouldNotBeFound"

regex = re.compile(r'^A\d{5}', re.IGNORECASE) #Checking only first six characters

count = 0
for files in os.listdir(sourcedir):  # looping over different files
    if regex.match(files):
        print(files)

    found = False

    # Search for a folder in Location 1
    for folderName in os.listdir(destinationdir1):
        if regex.match(folderName):
            print(folderName)
            # Copy the files from the input folder to output folder
            shutil.copy(sourcedir+'/'+files, destinationdir1+'//'+folderName)
            found = True
            break

    if not found:
        print('folder not found in Location1')
        count = 1

    # Search for a folder in Location 2
    for folderName in os.listdir(destinationdir2):
        if regex.match(folderName):
            #print(folderName)
            # Copy the files from the input folder to output folder
            shutil.copy(sourcedir+'/'+files, destinationdir1+'/'+folderName+'/'+files)
            found =  True
            break

    if not found:
        print('folder not found in Location2')
        count = 2

    # Folder Not Found
    if not found:
        print('copyingfilesinfoldernotfound')
        # Copy the files from the input folder to folder not found
        shutil.copy(sourcedir+'/'+files, destinationdir3+'/'+files)
Problems:

The input folder contains multiple files. I am having difficulty working out the logic to take one filename at a time and search the different folder locations for it, then move on to the second filename, search the folders again, and so on.

Is there any better way to write this code?

The Python code and folder structure are attached.

#2
Well, of course there will be a better way, or at least a different way. But I think it would be helpful to organize your code a little first. I'm sure your code will work eventually, but putting all those actions, if statements, etc. in just one script makes it hard to read and, more importantly, hard to extend with new code.

For example, the loops you use to search the folders could be a function that returns a value such as True or False. The function can be reused as many times as you want, no matter how many folders you want to search.
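
As a rough illustration of that idea, here is a minimal sketch, not a finished solution. The function names `find_matching_folder` and `copy_to_first_match` are hypothetical. Instead of a plain True or False, the search function returns the matching folder (or `None`), which can be tested the same way in an `if`. It assumes the search locations already exist and that a file whose name does not start with "A" plus five digits should also go to the fallback folder.

```python
import re
import shutil
from pathlib import Path

# Same pattern as in the original script: "A" followed by five digits.
PREFIX_RE = re.compile(r'^A\d{5}', re.IGNORECASE)


def find_matching_folder(prefix, location):
    """Return the first sub-folder of `location` whose name starts with `prefix`, else None."""
    for entry in sorted(Path(location).iterdir()):
        if entry.is_dir() and entry.name.upper().startswith(prefix.upper()):
            return entry
    return None


def copy_to_first_match(file_path, locations, fallback_dir):
    """Copy the file into the first matching folder, or into `fallback_dir` (hypothetical helper)."""
    file_path = Path(file_path)
    match = PREFIX_RE.match(file_path.name)
    target = None
    if match:
        # Try each location in order and stop at the first hit.
        for location in locations:
            target = find_matching_folder(match.group(0), location)
            if target is not None:
                break
    if target is None:
        # Nothing matched anywhere: use the "not found" folder, creating it if needed.
        target = Path(fallback_dir)
        target.mkdir(parents=True, exist_ok=True)
    shutil.copy(file_path, target / file_path.name)
    return target
```

With something like this, the main script shrinks to one loop over the files in the input folder that calls `copy_to_first_match(file, [destinationdir1, destinationdir2], destinationdir3)` for each file, and adding a third search location only means adding one more entry to the list.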

I do realize this is not a perfect answer to your question, but it is quite difficult to make suggestions for your code without rewriting it from the start. Working with smaller pieces of code will make it easier to give suggestions.
#3
Is this what you're looking for?
from pathlib import Path
import os
import re
import shutil
import sys


class FileOperations:
    def __init__(self,
        input_folder_name,
        compare_folder_name,
        output1_folder_name,
        output2_folder_name):

        # Note -- You can change directory locations to suit your needs.
        self.regex = re.compile(r'^A\d{5}', re.IGNORECASE)

        self.InputFolder = Path(input_folder_name)
        self.CompareFolder = Path(compare_folder_name)
        self.OutputFolder1 = Path(output1_folder_name)
        self.OutputFolder2 = Path(output2_folder_name)

    def get_dir(self, foldername):
        return [files for files in foldername.iterdir() if files.is_file()]
    
    def copy_matching_files(self):
        input_folder_files = self.get_dir(self.InputFolder)

        for filei in input_folder_files:
            # Path of the file with the same name in the compare folder
            comp_file_equivalent = self.CompareFolder / filei.name
            if self.regex.match(filei.stem) and comp_file_equivalent.exists():
                # Name matches the pattern and the file also exists in the compare folder
                outfile1 = self.OutputFolder1 / filei.name
                shutil.copyfile(filei, outfile1)
            else:
                # Everything else goes to the second output folder
                outfile2 = self.OutputFolder2 / filei.name
                shutil.copyfile(filei, outfile2)

    def display_files_in_dir(self, dirname, fullpath=False):
        print(f"\nContets pf {dirname}:")
        for file in self.get_dir(dirname):
            if fullpath:
                print(f"{file.resolve()}")
            else:
                print(f"{file.name}")


class MyProg:
    def __init__(self):
        # ensure the working directory is the script's own directory (change as necessary)
        os.chdir(os.path.abspath(os.path.dirname(__file__)))

        # replace paths with real locations
        self.file_ops = FileOperations(
            './data/input', 
            './data/compare', 
            './data/output1',
            './data/output2')
    
    def move_files(self):
        self.file_ops.copy_matching_files()

        print(f"\nFiles that match:")
        for file in self.file_ops.get_dir(self.file_ops.OutputFolder1):
            print(file.name)
        
        print(f"\nFiles that did not match:")
        for file in self.file_ops.get_dir(self.file_ops.OutputFolder2):
            print(file.name)


def testit():
    mp = MyProg()
    print(f"\nBefore Move:")
    mp.file_ops.display_files_in_dir(mp.file_ops.InputFolder)
    mp.file_ops.display_files_in_dir(mp.file_ops.CompareFolder)
    mp.file_ops.display_files_in_dir(mp.file_ops.OutputFolder1)
    mp.file_ops.display_files_in_dir(mp.file_ops.OutputFolder2)

    mp.move_files()

    print(f"\nAfter Move:")
    mp.file_ops.display_files_in_dir(mp.file_ops.InputFolder)
    mp.file_ops.display_files_in_dir(mp.file_ops.CompareFolder)
    mp.file_ops.display_files_in_dir(mp.file_ops.OutputFolder1)
    mp.file_ops.display_files_in_dir(mp.file_ops.OutputFolder2)

if __name__ == '__main__':
    testit()
Output:

Before Move:

Contets pf data/input:
A10621.sif
B5678.txt
TestData1000.csv
TestData10000.csv
TestData10001.csv
TestData10002.csv
TestData10003.csv
TestData1001.csv
TestData1003.csv
Z011014.sif

Contets pf data/compare:
A10621.sif
B5678.txt
TestData1000.csv
TestData10000.csv
TestData1001.csv
TestData1003.csv

Contets pf data/output1:

Contets pf data/output2:

Files that match:
A10621.sif

Files that did not match:
B5678.txt
TestData1000.csv
TestData10000.csv
TestData10001.csv
TestData10002.csv
TestData10003.csv
TestData1001.csv
TestData1003.csv
Z011014.sif

After Move:

Contets pf data/input:
A10621.sif
B5678.txt
TestData1000.csv
TestData10000.csv
TestData10001.csv
TestData10002.csv
TestData10003.csv
TestData1001.csv
TestData1003.csv
Z011014.sif

Contets pf data/compare:
A10621.sif
B5678.txt
TestData1000.csv
TestData10000.csv
TestData1001.csv
TestData1003.csv

Contets pf data/output1:
A10621.sif

Contets pf data/output2:
B5678.txt
TestData1000.csv
TestData10000.csv
TestData10001.csv
TestData10002.csv
TestData10003.csv
TestData1001.csv
TestData1003.csv
Z011014.sif