Python Forum
undup - Printable Version


undup - Skaperen - May-10-2019

i'm looking for a function to take a sequence that might have duplicates and return one of the same type with no more than one of each item. i only need it for lists for now.


RE: undup - buran - May-10-2019

set(iterable) is the first thing that comes to mind, but as always, the devil is in the details:
- what type are the elements (are they hashable)?
- are there nested structures, and how should they be handled if present?
- do you need to keep the order?
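
A minimal sketch of the set() idea and two of the caveats listed above, using made-up sample data. It only illustrates that set() gives no ordering promise and rejects unhashable elements.

```python
items = ["b", "a", "b", "c", "a"]  # hypothetical sample data

# set() drops duplicates but does not promise any particular order.
unique = set(items)
print(sorted(unique))  # sorted only to make the output stable: ['a', 'b', 'c']

# Unhashable elements (lists, dicts, sets) cannot be put into a set at all.
try:
    set([[1, 2], [1, 2]])
    unhashable_ok = True
except TypeError as exc:
    unhashable_ok = False
    print("cannot build the set:", exc)
```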


RE: undup - buran - May-10-2019

(May-10-2019, 06:56 AM)Skaperen Wrote: return one of the same type
Also, on second read: do you mean dedup values or dedup types (i.e. only one str, one int, etc.)?


RE: undup - Skaperen - May-10-2019

i do want to keep the original order, but i need to check if i really need that; otherwise set(iterable) would do the job. i mean values, not types. in my current case there are about 100 strings, mostly different.
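
For hashable values such as strings, one common way to drop duplicates while keeping the original order is dict.fromkeys(). This is a minimal sketch, assuming Python 3.7 or later (where dicts keep insertion order); the name undup_hashable and the sample data are made up for illustration.

```python
def undup_hashable(items):
    """Return a list with duplicates removed, keeping first-seen order."""
    # dict keys are unique and, since Python 3.7, keep insertion order.
    return list(dict.fromkeys(items))


names = ["pear", "apple", "pear", "fig", "apple"]  # hypothetical data
print(undup_hashable(names))  # ['pear', 'apple', 'fig']
```

Like set(), this raises TypeError if any item is unhashable.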


RE: undup - perfringo - May-11-2019

Assuming the objective is to dedupe hashable objects in an iterable while maintaining order:


def dedupe(iterable):
    """Yield each hashable item once, in first-seen order."""
    seen = set()  # items already yielded
    for item in iterable:
        if item not in seen:
            yield item
            seen.add(item)
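
Since dedupe() is a generator, the result has to be collected to get "one of the same type" back. A possible way to do that is sketched here; the wrapper name undup is hypothetical, and the generator is repeated only so the block runs on its own.

```python
def dedupe(iterable):
    # Same generator as above, repeated so this block is self-contained.
    seen = set()
    for item in iterable:
        if item not in seen:
            yield item
            seen.add(item)


def undup(seq):
    """Hypothetical wrapper: return the same container type as the input."""
    # Works for types whose constructor accepts an iterable (list, tuple).
    # A str would need "".join(dedupe(seq)) instead.
    return type(seq)(dedupe(seq))


print(undup([3, 1, 3, 2, 1]))  # [3, 1, 2]
print(undup(("a", "b", "a")))  # ('a', 'b')
```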



RE: undup - Skaperen - May-12-2019

(May-11-2019, 05:43 AM)perfringo Wrote: Assuming the objective is to dedupe hashable objects in an iterable while maintaining order:


def dedupe(iterable):
    seen = set()
    for item in iterable:
        if item not in seen:
            yield item
            seen.add(item)

the items are either hashable or references to the same object, so a set() alone won't do; it will need to be a list, at least for the non-hashables. deep leaf comparison is not needed.

so, it doesn't pre-exist.
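
One way the requirement above could be sketched: hashable items are deduplicated by value with a set, and non-hashable items are deduplicated by identity (the `is` operator) using a list, with no deep comparison. The function name undup and the sample data are made up; this is an illustration under those assumptions, not an established library function.

```python
def undup(items):
    """Return a list without repeats, keeping first-seen order.

    Hashable items are compared by value; unhashable items are compared
    by object identity only (no deep comparison).
    """
    seen = set()          # hashable items already kept
    seen_unhashable = []  # unhashable objects already kept
    result = []
    for item in items:
        try:
            if item in seen:
                continue
            seen.add(item)
        except TypeError:
            # item is unhashable: fall back to a linear identity scan.
            if any(item is other for other in seen_unhashable):
                continue
            seen_unhashable.append(item)
        result.append(item)
    return result


a = [1, 2]
b = [1, 2]  # equal to a, but a different object
out = undup(["x", a, "x", a, b, "y"])
print(len(out))  # 4: "x", a, b, "y"
```

The identity scan is quadratic in the number of non-hashable items, which should be fine for inputs of around 100 elements.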