Duplicated words in a list

pooyan89 · (This post was last modified: Jun-15-2019, 12:48 PM by pooyan89.)

Hi!
I have a task:

Write a function called has_duplicates that takes a list and returns True if there is any element that appears more than once. It should not modify the original list.

Here is my attempt, but I don't know why it fails!

def has_dup(v):
    c=0
    
    for i in range(len(v)):
        for j in range(len(v)-1):
            if v[i]==v[j+1]:
                c=c+1
                print(c)
    if c>1:
        return True
    if c<=1:
        return False

If v=['car','bar','are'], then has_dup(v) must be False, but it gives me True every time! Why?

v[0] is not the same as v[1] or v[2], and v[1] is not the same as v[2] either, so why does the value of c increase?

ThomasL · (This post was last modified: Jun-15-2019, 12:58 PM by ThomasL.)

Think simpler!

def has_duplicates(source):
    return len(set(source)) != len(source)

has_duplicates([1,2,3,4,4,5,6,7,8])
True

pooyan89 · (This post was last modified: Jun-15-2019, 01:00 PM by pooyan89.)

Thank you, but I have to solve it with loops. I want to know why this code does not work correctly.

ThomasL · Jun-15-2019, 01:03 PM

Then think about the start point of the second loop.

**Yoriz** · Jun-15-2019, 01:24 PM

If you add a print, you'll see the following:
when the first loop is on the second item 'bar', the second loop also reaches the second item;
when the first loop is on the third item 'are', the second loop also reaches the third item.

def has_dup(v):
    c = 0

    for i in range(len(v)):
        for j in range(len(v)-1):
            print(f'index:{i} item:{v[i]} with index:{j+1} item:{v[j+1]}')
            if v[i] == v[j+1]:
                c = c+1
                print(c)
    if c > 1:
        return True
    if c <= 1:
        return False


v = ['car', 'bar', 'are']
has_dup(v)

Output:
index:0 item:car with index:1 item:bar
index:0 item:car with index:2 item:are
index:1 item:bar with index:1 item:bar
1
index:1 item:bar with index:2 item:are
index:2 item:are with index:1 item:bar
index:2 item:are with index:2 item:are
2

You need to stop an item from being compared with itself, which happens whenever both loops are on the same index.
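One possible way to do that, keeping the index-based loops from the original attempt, is to start the inner loop one position after the outer index. This is only a sketch of that idea, not the only fix; the name has_duplicates is taken from the task description, and the counter c is dropped because a single match is already enough to answer the question.

```python
def has_duplicates(v):
    """Return True if any element of v appears more than once."""
    for i in range(len(v)):
        # Compare v[i] only with the items after it, so an item is
        # never compared with itself and each pair is checked once.
        for j in range(i + 1, len(v)):
            if v[i] == v[j]:
                return True  # one match is enough
    return False


print(has_duplicates(['car', 'bar', 'are']))          # False
print(has_duplicates(['bar', 'foo', 'spam', 'bar']))  # True
```

The list is only read, never changed, so the requirement not to modify the original list is met.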

pooyan89 · Jun-15-2019, 01:27 PM

Thank you!
Now I know what the error is!

noisefloor · Jun-15-2019, 06:18 PM

Hi,

@pooyan89: iterating over an iterable with for x in range(...) and then using the index to access item x of the iterable is a BIG anti-pattern. Simply don't do it. Python can iterate directly over iterables using for item in iterable:. If you really need the index of the item, use enumerate: for index, item in enumerate(iterable):
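As a rough illustration of the enumerate style, the sketch below collects the positions of equal items. The function name duplicate_positions is made up for this example and is not part of the task.

```python
def duplicate_positions(items):
    """Return (earlier_index, later_index) pairs of equal items."""
    pairs = []
    for i, first in enumerate(items):
        # Slice from i + 1 and tell enumerate to count from i + 1,
        # so j is the real index and no item meets itself.
        for j, second in enumerate(items[i + 1:], start=i + 1):
            if first == second:
                pairs.append((i, j))
    return pairs


print(duplicate_positions(['bar', 'foo', 'spam', 'bar']))  # [(0, 3)]
print(duplicate_positions(['car', 'bar', 'are']))          # []
```

The slice items[i + 1:] makes a copy of the tail of the list, which is fine for short lists and leaves the original list untouched.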

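Apart from this, your inner loop visits the indexes 1 to len(v)-1 for every value of i, so each item except the first one is compared with itself and counted as a match. That is why any list with three or more items gives True, for example ['car', 'bar', 'are'], even though it contains no duplicates.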

Using the solution with set and the length comparison would be the way to do it, but if you need to use a loop, use a second list, the in operator and a comparison of the lengths:

>>> def has_duplicates(iterable):
...     other_list = []
...     for item in iterable:
...         if item not in other_list:
...             other_list.append(item)
...     if len(iterable) > len(other_list):
...         return True
...     else:
...         return False
... 
>>> foo = ['foo', 'bar', 'spam']
>>> bar = ['foo', 'bar', 'bar', 'spam']
>>> spam = ['bar', 'foo', 'spam', 'bar']
>>> has_duplicates(foo)
False
>>> has_duplicates(bar)
True
>>> has_duplicates(spam)
True
>>>

Regards, noisefloor

**perfringo** · (This post was last modified: Jun-15-2019, 08:21 PM by perfringo.)

Maybe implement short-circuiting behaviour? As soon as the first duplicate is encountered, the function returns True and does not iterate over the remaining items. Something like this:

def has_duplicate(iterable):
    seen = set()               # if items are not hashable then use list
    for item in iterable:
        if item in seen:
            return True
        seen.add(item)
    return False
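For items that are not hashable, such as lists, the comment above suggests using a list for seen. A minimal sketch of that variant could look like the following; the name has_duplicate_unhashable is only chosen here to keep it apart from the set version.

```python
def has_duplicate_unhashable(iterable):
    """Short-circuiting check that also works for unhashable items."""
    seen = []  # a list accepts unhashable items, unlike a set
    for item in iterable:
        if item in seen:  # linear search using ==
            return True
        seen.append(item)
    return False


print(has_duplicate_unhashable([[1, 2], [3], [1, 2]]))  # True
print(has_duplicate_unhashable([[1, 2], [3]]))          # False
```

The price is speed: each in test on a list scans the items seen so far, while a set lookup takes roughly constant time on average.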
