Posts: 13
Threads: 6
Joined: Oct 2017
May-25-2018, 10:23 AM
(This post was last modified: May-25-2018, 10:23 AM by sonicblind.)
Hi,
I have a list of class instances:
mylist = [ myClass(name=n1), myClass(name=n2), myClass(name=n1), myClass(name=n3) ]
How can I get rid of duplicates in this list?
I know that the two instances with name='n1' are actually two different objects, so strictly speaking they are not duplicates.
But for my project, several instances with the same parameters are effectively duplicates, and I need to clean up the list accordingly.
Is there any easy way to do it?
Thanks.
Posts: 8,155
Threads: 160
Joined: Sep 2016
May-25-2018, 11:46 AM
(This post was last modified: May-25-2018, 11:47 AM by buran.)
You need to implement some special methods in your class. Here is a quick example:
class Person():
    def __init__(self, name):
        self.name = name
    def __eq__(self, other):
        return (self.name, ) == (other.name, )
    def __hash__(self):
        return hash((self.name,))
    def __str__(self):
        return self.name
person1 = Person('John')
person2 = Person('Allice')
person3 = Person('John')
persons = [person1, person2, person3]
print(persons)
for person in persons:
    print(person)
unique_persons = set(persons)
print(unique_persons)
for person in unique_persons:
    print(person)
Output:
[<__main__.Person instance at 0x02FB1B20>, <__main__.Person instance at 0x02FB53F0>, <__main__.Person instance at 0x02FB5418>]
John
Allice
John
set([<__main__.Person instance at 0x02FB1B20>, <__main__.Person instance at 0x02FB53F0>])
John
Allice
>>>
Note that I have added __str__() only to make the example output clearer.
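For readers on Python 3, the same idea can be sketched roughly as follows. This is a minimal sketch, not the code from the post: the `__repr__` method, the `isinstance` check returning `NotImplemented`, and the use of `dict.fromkeys` (to keep first-seen order, which a plain `set` does not guarantee) are additions made for illustration.

```python
class Person:
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Assumption for this sketch: only compare against other Person objects.
        if not isinstance(other, Person):
            return NotImplemented
        return self.name == other.name

    def __hash__(self):
        # Must agree with __eq__: objects that compare equal need equal hashes.
        return hash(self.name)

    def __repr__(self):
        return f"Person({self.name!r})"


persons = [Person('John'), Person('Allice'), Person('John')]

# dict keys are unique according to __hash__/__eq__ and keep
# insertion order in Python 3.7+, so the first 'John' survives.
unique_persons = list(dict.fromkeys(persons))
print(unique_persons)
```

If order does not matter, `set(persons)` works the same way as in the example above.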
If for some reason you don't want to, or are not able to, change the class, you can do it like this:
class Person():
    def __init__(self, name):
        self.name = name
    def __str__(self):
        return self.name
person1 = Person('John')
person2 = Person('Allice')
person3 = Person('John')
persons = [person1, person2, person3]
print(persons)
for person in persons:
    print(person)
def dedup(persons):
    unique = []
    unique_names = set()
    for person in persons:
        if person.name not in unique_names:
            unique.append(person)
            unique_names.add(person.name)
    return unique
unique_persons = dedup(persons)
print(unique_persons)
for person in unique_persons:
    print(person)
Output:
[<__main__.Person instance at 0x02FD1E40>, <__main__.Person instance at 0x02FD53A0>, <__main__.Person instance at 0x02FD53C8>]
John
Allice
John
[<__main__.Person instance at 0x02FD1E40>, <__main__.Person instance at 0x02FD53A0>]
John
Allice
>>>
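The dedup() idea above can be generalized so that the attribute used for comparison is passed in rather than hard-coded. The following is only a sketch under that assumption; the helper name `dedup_by` and its `key` parameter are hypothetical and not from the thread.

```python
from operator import attrgetter


def dedup_by(items, key):
    """Return items without duplicates, keeping the first item seen per key."""
    seen = set()
    unique = []
    for item in items:
        k = key(item)  # the key value must be hashable
        if k not in seen:
            seen.add(k)
            unique.append(item)
    return unique


class Person:
    def __init__(self, name):
        self.name = name


persons = [Person('John'), Person('Allice'), Person('John')]

# attrgetter('name') builds a function that returns obj.name
unique_persons = dedup_by(persons, key=attrgetter('name'))
print([p.name for p in unique_persons])
```

Because the class itself is untouched, this works for classes you cannot modify, and the original list order is preserved.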
Posts: 2,953
Threads: 48
Joined: Sep 2016
What is the basis in order to compare classes? The __hash__ method?
Posts: 8,155
Threads: 160
Joined: Sep 2016
(May-25-2018, 12:23 PM)wavic Wrote: What is the basis in order to compare classes?
No, it is the comparison special methods (__eq__() in this example): https://rszalski.github.io/magicmethods/#comparisons
__hash__() is there to make the class hashable. In this example, without it you would not be able to put instances of the class in a set.
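To see why both methods are needed, here is a small Python 3 sketch (an illustration, not code from the thread). In Python 3, a class that defines `__eq__` without `__hash__` has its `__hash__` set to `None`, so its instances cannot be placed in a set.

```python
class Person:
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, Person) and self.name == other.name
    # No __hash__ here: Python 3 sets Person.__hash__ to None.


try:
    {Person('John')}      # building a set requires hashing the instance
    hashable = True
except TypeError as exc:
    hashable = False
    print(exc)
```

So `__eq__` decides what counts as "the same", and `__hash__` is what lets sets and dict keys find candidates to compare in the first place.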
Posts: 2,953
Threads: 48
Joined: Sep 2016
May-25-2018, 12:47 PM
(This post was last modified: May-25-2018, 12:49 PM by wavic.)
As I see it, what will be compared depends on me?
Btw, the link is great!
Posts: 13
Threads: 6
Joined: Oct 2017
Thank you all.
For my script, these methods are overkill, but it's good to know the approach and that there is no magic trick to do it easily.
I will do it the way Buran suggested when I am not able to change the class definition. Basically create a separate list of unique values of instance names.
Posts: 8,155
Threads: 160
Joined: Sep 2016
May-25-2018, 02:05 PM
(This post was last modified: May-25-2018, 02:05 PM by buran.)
(May-25-2018, 01:50 PM)sonicblind Wrote: I will do it the way Buran suggested when I am not able to change the class definition. Basically create a separate list of unique values of instance names.
Thinking some more about it, you can do it like this:
class Person():
    def __init__(self, name):
        self.name = name
    def __str__(self):
        return self.name
person1 = Person('John')
person2 = Person('Allice')
person3 = Person('John')
persons = [person1, person2, person3]
print(persons)
for person in persons:
    print(person)
unique_persons = {person.name:person for person in persons}.values()
print(unique_persons)
for person in unique_persons:
    print(person)
Output:
[<__main__.Person instance at 0x030A4A58>, <__main__.Person instance at 0x03121B20>, <__main__.Person instance at 0x03121E40>]
John
Allice
John
[<__main__.Person instance at 0x03121E40>, <__main__.Person instance at 0x03121B20>]
John
Allice
>>>
This makes a dictionary keyed by name and then takes the list of its values (the instances of the class). In this case it keeps the last instance with a given name, while the other example keeps the first instance with a given name. If you want to use a multi-attribute key, use a tuple of these attributes as the key.
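A sketch of the multi-attribute variant, assuming Python 3.7+ where dicts keep insertion order. The `city` attribute is hypothetical and added only to have a second field; `setdefault` is used here as one way to keep the first instance instead of the last.

```python
class Person:
    def __init__(self, name, city):
        self.name = name
        self.city = city  # hypothetical second attribute, for illustration only


persons = [
    Person('John', 'Oslo'),
    Person('Allice', 'Oslo'),
    Person('John', 'Oslo'),   # same (name, city) as the first entry
    Person('John', 'Rome'),   # same name, different city: not a duplicate
]

# Dict comprehension: a later instance overwrites an earlier one with the same key.
last_wins = list({(p.name, p.city): p for p in persons}.values())

# setdefault only stores a value if the key is not present yet,
# so the first instance per key is kept.
first_by_key = {}
for p in persons:
    first_by_key.setdefault((p.name, p.city), p)
first_wins = list(first_by_key.values())

print([(p.name, p.city) for p in first_wins])
```

Both lists contain three people; they differ only in which of the two identical ('John', 'Oslo') objects is retained.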
Posts: 7,313
Threads: 123
Joined: Sep 2016
An example with a set comprehension.
>>> persons = [person1, person2, person3]
>>> unique_persons = {person.name for person in persons}
>>> unique_persons
{'Allice', 'John'}
If you need to keep the order, buran's dictionary approach (Python 3.6+) will do that.
If you just need a unique person list, a set comprehension is good.
Posts: 8,155
Threads: 160
Joined: Sep 2016
May-25-2018, 06:39 PM
(This post was last modified: May-25-2018, 07:55 PM by buran.)
(May-25-2018, 06:09 PM)snippsat Wrote: Just need a unique person list, set comprehension is good.
Just a small note: that is a set of unique person names, but the OP wants a unique list of instances.