Python Forum
Creating a word list - Printable Version

Creating a word list - opencircles - Mar-19-2019

I would like to take any text document and make a word list from the file with no duplicates. I don't know much about programming. This is the best I have been able to come up with. Thanks!

 	
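def MakeWordList():
    # Read the whole file into one string
    with open('test.txt', 'r') as f:
        data = f.read()
    # split() breaks on whitespace; a set keeps each distinct word once
    return set(data.split())


# Call the function and show the result
print(MakeWordList())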



RE: Creating a word list - snippsat - Mar-19-2019

Here is something you may need to look into.
In:
hello this is test. Hello doing a test here.
Out:
{'doing', 'this', 'Hello', 'here.', 'a', 'test.', 'test', 'is', 'hello'}
Are 'Hello' and 'hello', or 'test' and 'test.', not duplicates?
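
One possible way to treat those pairs as duplicates is to lowercase each word and strip punctuation from its ends before adding it to the set. This is only a minimal sketch: the function name make_word_list is hypothetical, and it assumes that ignoring case and leading/trailing punctuation is what is wanted (punctuation inside a word, such as the apostrophe in "don't", is left alone).

```python
import string


def make_word_list(text):
    """Return the distinct words in text, ignoring case and edge punctuation."""
    words = set()
    for token in text.split():
        # Remove punctuation from both ends of the token, then lowercase it
        word = token.strip(string.punctuation).lower()
        if word:  # skip tokens that consisted only of punctuation
            words.add(word)
    return words


sample = "hello this is test. Hello doing a test here."
print(sorted(make_word_list(sample)))
# → ['a', 'doing', 'hello', 'here', 'is', 'test', 'this']
```

Sorting the set before printing gives a stable order, since sets themselves are unordered.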


RE: Creating a word list - MohanReddy - Mar-19-2019

Assuming the text file contains the input below:

Hi Good Morning! How are you? I am fine. How are studying?
I am Apple. I am doing fine. Thanks for asking.
I am April. I am ok. Not good

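# The with block closes the file automatically when the loop finishes
with open(r"C:\Users\***\Desktop\Python Forum\word_list.txt") as raw_data:
    for line in raw_data:
        # Builds one set of unique words per line, not one for the whole file
        unique_line_list = set(line.split())
        print(unique_line_list)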
Output:
{'How', 'Hi', 'Morning!', 'are', 'am', 'fine.', 'I', 'you?', 'studying?', 'Good'}
{'Apple.', 'for', 'doing', 'fine.', 'I', 'Thanks', 'asking.', 'am'}
{'good', 'April.', 'Not', 'I', 'ok.', 'am'}
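
Note that the loop above produces a separate set for each line, so words such as 'I' and 'am' still appear once per line. If a single word list for the whole file is wanted, one option is to keep one set and update it with each line's words. This is a rough sketch, not a definitive version: the function name unique_words is hypothetical, and the sample lines are the input text from this post rather than a real file.

```python
def unique_words(lines):
    """Collect the distinct whitespace-separated words across all lines."""
    words = set()
    for line in lines:
        # update() adds every word from this line; repeats are ignored
        words.update(line.split())
    return words


lines = [
    "Hi Good Morning! How are you? I am fine. How are studying?",
    "I am Apple. I am doing fine. Thanks for asking.",
    "I am April. I am ok. Not good",
]
result = unique_words(lines)
print(sorted(result))
```

An open file object can be passed in place of the list, since iterating over a file yields its lines. Case and punctuation are left untouched here, so 'Good' and 'good' still count as different words.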