Python Forum
rename same file names in different directories
#1
I have a directory which has 10 other dirs with many files:

- directory
  - dir1
  - dir2
  - dir3
  - ...

How can I rename duplicate files across all the dirs?
For example, if there are 2 files named "file1.txt", one in "dir1" and one in "dir2", one of them must be renamed.
Reply
#2
import os; base_dir='path/to/your/directory'; seen_files={}; [os.rename(os.pa

Replace 'path/to/your/directory' with the actual path. This will rename duplicates by adding a counter.
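The one-liner in this post is cut off, so here is a minimal sketch of the counter idea it describes, not a reconstruction of the original code. The function name rename_duplicates and the "_1", "_2" suffix format are my own assumptions; seen_files is a set here rather than the dict used in the post.

```python
import os

def rename_duplicates(base_dir):
    """Rename files whose name was already seen in another directory.

    Hypothetical helper: the first file with a given name keeps it, later
    ones get a counter suffix, e.g. file1.txt -> file1_1.txt.
    Returns a list of (old_path, new_path) pairs.
    """
    seen_files = set()  # every file name used so far, across all directories
    renamed = []
    for root, dirs, files in os.walk(base_dir):
        dirs.sort()  # visit subdirectories in a predictable order
        for name in sorted(files):
            if name not in seen_files:
                seen_files.add(name)  # first occurrence keeps its name
                continue
            stem, ext = os.path.splitext(name)
            counter = 1
            new_name = f"{stem}_{counter}{ext}"
            # Bump the counter until the name is unused everywhere
            while new_name in seen_files or os.path.exists(os.path.join(root, new_name)):
                counter += 1
                new_name = f"{stem}_{counter}{ext}"
            old_path = os.path.join(root, name)
            new_path = os.path.join(root, new_name)
            os.rename(old_path, new_path)
            seen_files.add(new_name)
            renamed.append((old_path, new_path))
    return renamed
```

Try it on a copy of the directory first, for example rename_duplicates('path/to/your/directory'), and check the returned pairs before running it on the real data.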
Reply
#3
Here is one way:

import os

# Path for executing script
path = os.path.realpath(os.path.dirname(__file__))

# Get all folders in current folder
folders = [folder for folder in os.listdir(path) if os.path.isdir(os.path.join(path, folder))]

# Loop through folders
for folder in folders:

    # Get all entries in each folder
    files = os.listdir(f'{path}/{folder}')

    # Loop through files and change file name by adding folder prefix
    for file in files:
        # Skip subfolders so only files are renamed
        if os.path.isfile(f'{path}/{folder}/{file}'):
            os.rename(f'{path}/{folder}/{file}', f'{path}/{folder}/{folder}_{file}')
Reply
#4
See this post.
Reply
#5
The files are different files, because they live at different paths. If you change one, the others will not change.

I put a.txt and b.txt in the main folder and in 3 subfolders.

If you really want to do this (I think it is unnecessary), maybe like this:

from pathlib import Path

def myApp():
    mydir = Path('/home/pedro/temp/')
    file_name_list = [filename.name for filename in mydir.rglob("*") if filename.is_file()]
    # count() needs a list, so the names are collected in a list rather than a generator
    # get tuples of name and number of duplicates
    duplicates = [(f, file_name_list.count(f)) for f in file_name_list if file_name_list.count(f) > 1]
    # get rid of the duplicates in duplicates
    duplicates_set = set(duplicates)
    # can't change sets so convert the set to a list
    dl = list(duplicates_set)
    # now change each tuple in dl to a list
    # because lists are mutable
    for i in range(len(dl)):
        dl[i] = list(dl[i])
    # a generator over all file paths
    file_name_gen = (filename for filename in mydir.rglob("*") if filename.is_file())
    # for each file, check whether its name is in one of the lists in dl
    for f in file_name_gen:
        for d in dl:
            if d[0] == f.name:
                print(f.name)
                newname = f.rename(Path(f.parent, f"{f.stem}_{d[1]}_{f.suffix}"))
                print(newname)
                # reduce the duplicate count by 1
                d[1] = d[1] - 1
The numbering for each duplicated file name starts at the number of files with that name and decreases by one with each rename.

Output:
myApp()
a.txt
/home/pedro/temp/a_4_.txt
b.txt
/home/pedro/temp/b_4_.txt
a.txt
/home/pedro/temp/arxiv.org/a_3_.txt
b.txt
/home/pedro/temp/arxiv.org/b_3_.txt
a.txt
/home/pedro/temp/asteriskmag.com/a_2_.txt
b.txt
/home/pedro/temp/asteriskmag.com/b_2_.txt
a.txt
/home/pedro/temp/20BE/a_1_.txt
b.txt
/home/pedro/temp/20BE/b_1_.txt
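Calling list.count() for every file gets slow with many files. As a rough sketch, the same counting can be done in one pass with collections.Counter. The function name plan_renames is made up for this example, and it only returns the planned renames without touching any file, so the plan can be checked first. It keeps the stem_N_suffix naming used in the code above.

```python
from collections import Counter
from pathlib import Path

def plan_renames(top):
    """Return (old_path, new_path) pairs for file names that occur more than once.

    Hypothetical helper: nothing is renamed here.
    """
    files = sorted(p for p in Path(top).rglob("*") if p.is_file())
    # Count every file name once instead of calling list.count() per file
    counts = Counter(p.name for p in files)
    remaining = {name: n for name, n in counts.items() if n > 1}
    plan = []
    for p in files:
        n = remaining.get(p.name)
        if n is None:
            continue  # name is unique, leave the file alone
        # Same scheme as above: the number counts down with each file
        plan.append((p, p.with_name(f"{p.stem}_{n}_{p.suffix}")))
        remaining[p.name] = n - 1
    return plan
```

If the plan looks right, the renames can be applied with a loop such as: for old, new in plan_renames(mydir): old.rename(new)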
Reply
#6
When removing duplicates, it can be better to compare file hashes; then you can be sure that a file really is a duplicate.
It's easy to do, as both hashlib and pathlib are in the standard library.
rglob('*') is recursive, so it iterates through all subdirectories.
from pathlib import Path
import hashlib
import os

def remove_duplicate(path: Path) -> None:
    '''
    Recursively iterate through all subdirectories of <path>.
    Uncomment the <os.remove> lines to actually delete duplicates.
    '''
    unique = {}
    for file in Path(path).rglob('*'):
        if file.is_file():
            with open(file, 'rb') as f:
                filehash = hashlib.md5(f.read()).hexdigest()
            if filehash not in unique:
                # First file seen with this content is kept
                unique[filehash] = file
            else:
                # Test print before removing; the later copy is the one removed
                print(f'Removing --> {file}')
                # try:
                #     os.remove(file)
                # except OSError:
                #     pass

if __name__ == '__main__':
    path = Path(r'G:\div_code\reader_env\my_folder')
    remove_duplicate(path)
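Reading each file fully with f.read() can use a lot of memory on big files. Below is a small sketch that hashes in chunks and groups files by content instead of deleting anything. The names file_digest and group_duplicates and the chunk size are my own choices, and I swapped md5 for sha256 here; either works for spotting identical content.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def file_digest(path, chunk_size=65536):
    """Hash a file in chunks so large files are not read into memory at once."""
    h = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()

def group_duplicates(top):
    """Map each digest to the files sharing that content (groups of 2 or more only)."""
    groups = defaultdict(list)
    for file in Path(top).rglob('*'):
        if file.is_file():
            groups[file_digest(file)].append(file)
    return {d: sorted(paths) for d, paths in groups.items() if len(paths) > 1}
```

Each value in the returned dict is a list of paths with identical content, so you can decide per group which copy to keep or rename.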
Reply

