Removing duplicate Image

Evil_Patrick · (This post was last modified: Jan-26-2020, 03:48 AM by Evil_Patrick.)

(Jan-24-2020, 12:28 PM)snippsat Wrote: The usual way to remove duplicate files is hash-based verification.
In Python you can use hashlib, for example.
Here is one way to do it.

import hashlib
import os

def remove_duplicate(path):
    unique = {}  # hash -> name of the first file seen with that content
    for file in os.scandir(path):
        if not file.is_file():
            continue  # skip subfolders
        with open(file.path, 'rb') as f:
            filehash = hashlib.md5(f.read()).hexdigest()
        if filehash not in unique:
            unique[filehash] = file.name
        else:
            # Test print before removing; the current file is the duplicate
            print(f'Removing --> {file.name}')
            #os.remove(file.path)

if __name__ == '__main__':
    path = r'C:\div_code\img'
    remove_duplicate(path)

Thanks, dude, but the pics are not exact duplicates. The thumbnail of each image is smaller in file size and dimensions, so it has a different hash.
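
To illustrate the point, here is a minimal sketch, not taken from the thread, of why content hashing only catches byte-for-byte copies. The file names, the byte strings standing in for image data, and the helpers file_digest and write_temp are all made up for the demo.

```python
import hashlib
import os
import tempfile

def file_digest(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    md5 = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            md5.update(chunk)
    return md5.hexdigest()

def write_temp(folder, name, data):
    """Hypothetical helper: write bytes to folder/name and return the path."""
    path = os.path.join(folder, name)
    with open(path, 'wb') as f:
        f.write(data)
    return path

# Placeholder bytes stand in for real image data (an assumption for the demo)
with tempfile.TemporaryDirectory() as tmp:
    original = write_temp(tmp, 'pic.jpg', b'full-size image bytes')
    copy = write_temp(tmp, 'pic_copy.jpg', b'full-size image bytes')
    thumb = write_temp(tmp, 'pic_thumb.jpg', b'smaller image bytes')
    same = file_digest(original) == file_digest(copy)      # identical content
    differs = file_digest(original) != file_digest(thumb)  # resized content
    print(same, differs)
```

An exact copy hashes the same as the original, but a resized thumbnail contains different bytes, so matching on the file name is the simpler route here.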

(Jan-23-2020, 09:19 PM)benlyboy Wrote: The only problem with a set loop is that as you remove files the count will change. I think I would walk through the directory, read the last 5 characters of the file name string, and if it == "thumb" then remove it.

I would have to sit down to write out the code (and the boss might get upset if I did that now), but I know I've done something similar before.
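
A rough sketch of that idea, assuming the thumbnails are named like cat_thumb.jpg (the function names, the dry_run flag, and the demo file names are hypothetical, not from the thread). It collects the matches first and deletes afterwards, so the directory listing is not changing while it is being read.

```python
from pathlib import Path
import tempfile

def find_thumbnails(folder, suffix='thumb'):
    """Return files in folder whose name, minus extension, ends with suffix."""
    return sorted(p for p in Path(folder).iterdir()
                  if p.is_file() and p.stem.endswith(suffix))

def remove_thumbnails(folder, dry_run=True):
    """Collect matches first, then delete; with dry_run=True only report them."""
    matches = find_thumbnails(folder)
    for p in matches:
        if dry_run:
            print(f'Would remove --> {p.name}')
        else:
            p.unlink()
    return [p.name for p in matches]

# Demo on a throwaway folder with made-up file names
with tempfile.TemporaryDirectory() as tmp:
    for name in ('cat.jpg', 'cat_thumb.jpg', 'dog.jpeg', 'dog_thumb.jpeg'):
        (Path(tmp) / name).touch()
    reported = remove_thumbnails(tmp)                # dry run, nothing deleted
    removed = remove_thumbnails(tmp, dry_run=False)  # actually deletes
    remaining = sorted(p.name for p in Path(tmp).iterdir())
```

Running with dry_run=True first shows what would be deleted before anything is actually removed.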

Thank you so much, dude, for your idea.
Thanks to everyone for responding on my thread.
Finally, all the thumbnail images are removed.
Here is my code:

import os

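folder = input("Enter folder location: ")  # avoid shadowing the built-in dir()

os.chdir(folder)

for name in os.listdir():
    # Compare the name without its extension, so .jpeg or .webp also work
    if os.path.splitext(name)[0].endswith('thumb'):
        os.remove(name)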

print("Thumbnails Removed")
