Dec-06-2023, 09:59 PM
Hello, I'm pretty new to coding and have started attending an online course on programming techniques. I'm stuck on one part of an assignment where I have to return the 5 most frequent words in a text file, disregarding any stop words. I'm not allowed to use any modules. My function only returns the index numbers and not the actual words. Does anyone have any tips on what I can do?
# Returns 5 most frequent words in a text
def important_words(an_index, stop_words):
    mydict = index_text(an_index)  # takes the result from index_text function
    all_words = []  # Create an empty list to put all words from an_index in

    # Removes stop_words from the text
    for item in stop_words:
        if item in an_index:
            del an_index[item]

    # Combine all the words into a single list
    for key in mydict:
        all_words.extend(mydict[key])

    # Count occurrences of each word
    word_counter = {}  # This dictionary stores words as keys and their respective counts as values
    for word in all_words:
        if word not in stop_words:  # If the word is a stop word, it's ignored and not included in word_counter
            if word in word_counter:
                word_counter[word] += 1  # If the word is already in the dictionary, increase its count by 1
            else:
                word_counter[word] = 1  # If the word isn't already in word_counter, it is added

    # Sorts the (key, value) tuples; x[1] takes the second element of each tuple (the value).
    # reverse=True orders them from the largest value to the smallest.
    sorted_words = sorted(word_counter.items(), key=lambda x: x[1], reverse=True)
    top_words = [word[0] for word in sorted_words[:5]]
    return top_words  # Returns the five most frequent words in the text
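One possible explanation for the symptom, sketched under an assumption the post does not confirm: if index_text returns a dictionary that maps each word to a list of the positions (for example line numbers) where it occurs, then extending all_words with mydict[key] collects those numbers rather than the words, so the counter ends up counting index numbers. In that case the words are the dictionary keys and the frequency of each word is the length of its list. The function name important_words_sketch, the parameter n, and the sample data below are hypothetical and only illustrate the idea; no modules are imported.

```python
# Assumption: word_index maps each word to a list of positions where it occurs,
# which is what index_text is presumed (not confirmed) to return.
def important_words_sketch(word_index, stop_words, n=5):
    # Count each word by the number of positions recorded for it, skipping stop words
    word_counter = {}
    for word, positions in word_index.items():
        if word not in stop_words:
            word_counter[word] = len(positions)

    # Sort the (word, count) pairs by count, largest first
    sorted_words = sorted(word_counter.items(), key=lambda pair: pair[1], reverse=True)

    # Keep only the words from the first n pairs
    return [word for word, count in sorted_words[:n]]


# Hypothetical sample data for illustration
sample_index = {
    "the": [1, 2, 3, 4],
    "cat": [1, 3, 4],
    "sat": [2],
    "mat": [2, 4],
    "on": [1, 2],
}
sample_stop_words = ["the", "on"]

print(important_words_sketch(sample_index, sample_stop_words))
```

If index_text returns something else, such as a plain list of words, the counting loop in the original function would be the better starting point and only the input to it would need to change.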