Python Forum
Removal of duplicates

Removal of duplicates - teebee891 - Feb-01-2021

Hi everyone,

I have written an anagram finder (below) that reads its words from a words.txt file.

with open('words.txt', 'r') as read:
    line = read.readlines()

def make_anagram_dict(line):
    word_list = {}

    for word in line:
        word = word.lower()
        key = ''.join(sorted(word))
        if key in word_list and len(word) > 5 and word not in word_list:
            word_list[key].append(word)
        else:
            word_list[key] = [word]

    return word_list

if __name__ == '__main__':
    word_list = make_anagram_dict(line)

    for key, words in word_list.items():
        if len(words) > 1:
            print('Key value' + ' '*len(key) + '|     words')
            print(key + ' '*len(key) + ':' + str(words))
            print('---------------------------------------------')

The output I get looks like this (for one arbitrary entry):

Output:
Key value       |     words
hortwy       :['worthy\n', 'wrothy\n']
---------------------------------------------

The problem is that the words.txt file contains entries that are duplicates apart from a capital first letter, e.g. Zipper and zipper. After lower-casing, the script therefore lists zipper as an anagram of itself, when it shouldn't. I tried to fix it with the part in bold. I would really appreciate any help!


RE: Removal of duplicates - jefsummers - Feb-01-2021

One way to eliminate duplicates would be to convert to a set, then convert back again.
1. Read the words into a list.
2. Convert the items to lower case.
3. Copy the list to a set.
4. Copy the set back to a list.
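
The four steps above could be sketched roughly as follows. This is only a minimal illustration, not necessarily what the original poster ended up using: the helper name unique_lowercase_words, the min_length parameter, and the sample_lines list (standing in for the result of readlines() on words.txt) are all assumptions made for the example. It also strips the trailing '\n' from each line, which the steps do not mention but which the output in the question suggests is needed.

```python
from collections import defaultdict


def unique_lowercase_words(lines):
    """Steps 2-4: lower-case each word, drop duplicates via a set, return a list."""
    # strip() removes the trailing '\n' that readlines() leaves on each line;
    # blank lines are skipped.
    lowered = [line.strip().lower() for line in lines if line.strip()]
    # A set keeps one copy of each word. Sets are unordered, so sorted()
    # is used to turn the set back into a list with a predictable order.
    return sorted(set(lowered))


def make_anagram_dict(words, min_length=6):
    """Group words of at least min_length letters by their sorted letters.

    min_length=6 is an assumption based on the 'len(word) > 5' check
    in the question.
    """
    groups = defaultdict(list)
    for word in words:
        if len(word) >= min_length:
            groups[''.join(sorted(word))].append(word)
    return dict(groups)


# Hypothetical stand-in for step 1 (reading words.txt with readlines()).
sample_lines = ['Zipper\n', 'zipper\n', 'worthy\n', 'wrothy\n', 'Worthy\n']

words = unique_lowercase_words(sample_lines)
# Keep only keys that have more than one word, i.e. real anagram groups.
anagrams = {key: group
            for key, group in make_anagram_dict(words).items()
            if len(group) > 1}
print(anagrams)
```

Because the duplicates are removed before the words are grouped, Zipper and zipper collapse into a single entry and no longer show up as an anagram pair, while worthy and wrothy are still grouped together. Note that converting to a set discards the original order of the file, which is why the sketch sorts the result.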