Jan-26-2020, 03:48 AM
(This post was last modified: Jan-26-2020, 03:48 AM by Evil_Patrick.)
(Jan-24-2020, 12:28 PM)snippsat Wrote: Removing duplicate files is normal to use Hash-based verification.
With Python can eg use hashlib.
So can example do it like this.
import hashlib import os def remove_duplicate(path): unique = {} os.chdir(path) for file in os.scandir(path): with open(file.name, 'rb') as f: filehash = hashlib.md5(f.read()).hexdigest() if filehash not in unique: unique[filehash] = file.name else: # Test print before removing print(f'Removing --> {unique[filehash]}') #os.remove(unique[filehash]) if __name__ == '__main__': path = r'C:\div_code\img' remove_duplicate(path)
Thanks Dude but The Pics are not duplicate the thumbnail of the Image is smaller in size and dimension so it has a different hash
(Jan-23-2020, 09:19 PM)benlyboy Wrote: The only problem with a set loop is that as you remove files the count will change. I think I would walk though the directory and read the last 5 characters of the file name string and if it =="thumb" then remove it.
I would have to sit down to write out the code( and the boss might get upset if I did that now ) but I know I've done something similar before.
Thank you so much Dude for your Idea.
Thanks to everyone for responding on my thread.
Finally all thumbnail Images are removed
Here is my code:
import os dir = input("Enter folder location: ") os.chdir(dir) for files in list(os.listdir()): if files[-9:-4] == 'thumb': os.remove(files) print("Thumbnails Removed")