I would like to take any text document and make a word list from the file with no duplicates. I don't know much about programming. This is the best I have been able to come up with. Thanks!
def MakeWordList():
    # Read the whole file and split it on whitespace
    with open('test.txt', 'r') as f:
        data = f.read()
    # A set keeps only one copy of each word
    return set(data.split())

print(MakeWordList())
Something you may need to look into.
Input:
hello this is test.
Hello doing a test here.
Output:
{'doing', 'this', 'Hello', 'here.', 'a', 'test.', 'test', 'is', 'hello'}
Are 'Hello' and 'hello', and 'test' and 'test.', not duplicates?
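If those should count as the same word, one option is to lower-case each word and strip punctuation from its ends before adding it to the set. Here is a minimal sketch of that idea; the function name unique_words and the sample string are just for illustration, and it only trims punctuation at the start and end of a word, so something like "don't" stays as it is.

```python
import string

def unique_words(text):
    """Return the distinct words in text, ignoring case and edge punctuation."""
    words = set()
    for token in text.split():
        # Trim punctuation from both ends, then lower-case the word
        word = token.strip(string.punctuation).lower()
        if word:  # skip tokens that were nothing but punctuation
            words.add(word)
    return words

# Same two lines as the input shown above
sample = "hello this is test.\nHello doing a test here."
print(sorted(unique_words(sample)))
# ['a', 'doing', 'hello', 'here', 'is', 'test', 'this']
```

To use it on a file, read the file as in the original function and pass the resulting string to unique_words.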
# Assuming the text file contains the following input:
Hi Good Morning! How are you? I am fine. How are studying?
I am Apple. I am doing fine. Thanks for asking.
I am April. I am ok. Not good
with open(r"C:\Users\***\Desktop\Python Forum\word_list.txt") as raw_data:
    for line in raw_data:
        # Each line gets its own set of unique words
        unique_line_list = set(line.split())
        print(unique_line_list)
Output:
{'How', 'Hi', 'Morning!', 'are', 'am', 'fine.', 'I', 'you?', 'studying?', 'Good'}
{'Apple.', 'for', 'doing', 'fine.', 'I', 'Thanks', 'asking.', 'am'}
{'good', 'April.', 'Not', 'I', 'ok.', 'am'}
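Note that this gives one set per line, so a word such as 'I' or 'am' still shows up once for every line it appears on. If the goal is a single word list for the whole file, the per-line words can be collected into one set instead. A rough sketch, assuming a helper name (unique_words_in_lines) that is not from the posts above, and using a list of strings in place of the file:

```python
def unique_words_in_lines(lines):
    """Collect the distinct whitespace-separated words across all lines."""
    seen = set()
    for line in lines:
        # update() adds every word from this line to the running set
        seen.update(line.split())
    return seen

# The three input lines from the example above
lines = [
    "Hi Good Morning! How are you? I am fine. How are studying?",
    "I am Apple. I am doing fine. Thanks for asking.",
    "I am April. I am ok. Not good",
]
print(sorted(unique_words_in_lines(lines)))
```

An open file can be passed in place of the list, since iterating over a file object yields its lines. Case and punctuation are left untouched here, so 'Good' and 'good' are still counted separately.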