Python Forum
Extending my text file word count ranker and calculator
I am playing with large plain text files like Alice in Wonderland and trying to rank the most commonly occurring words. Naturally, you’d expect to encounter many instances of “the”, “and”, “a”.

With a little help from @snippsat in my previous thread, check out this script we were working with:

from collections import Counter
import re
  
with open('Alice.txt') as f:
    text = f.read().lower()
  
words = re.findall(r'\w+', text)  # raw string so \w is not treated as a string escape
top_10 = Counter(words).most_common(10)
for word,count in top_10:
    print(f'{word:<4} {"-->":^4} {count:>4}')
Here is the smooth output:
Output:
$ python with_word_count.py
the  -->  1818
and  -->   940
to   -->   809
a    -->   690
of   -->   631
it   -->   610
she  -->   553
i    -->   543
you  -->   481
said -->   462
It works really well. I have already extended it by adding a feature which provides the total word count. Here is the code I added:

 
wordlist = text.split()
print("A total of " + str(len(wordlist)) + " words can be found inside this text file.") 
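One thing worth keeping in mind about this addition: text.split() splits on whitespace only, while re.findall with \w+ splits on every non-word character, so the total reported here will not necessarily match the number of tokens that get ranked. A minimal sketch of the difference, using a made-up sample string (not taken from Alice.txt):

```python
import re

# Hypothetical sample text, chosen only to show how the two approaches differ.
sample = "alice's adventures -- don't panic!"

# Whitespace splitting keeps punctuation attached and counts "--" as a word.
split_words = sample.split()

# \w+ matches runs of letters, digits and underscores, so apostrophes
# break "alice's" and "don't" into two tokens each and "--" is dropped.
regex_words = re.findall(r'\w+', sample)

print(len(split_words), split_words)
print(len(regex_words), regex_words)
```

Which count is the "right" one depends on what should be treated as a word; using the same tokenizer for both the total and the ranking would at least keep the two numbers consistent.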
I have now set out to extend the features of this script further. For now I am just trying to reorganize and consolidate these operations into separate functions. The script looks a little different. Here it is:

from collections import Counter
import re
 
def word_count(text):
    wordlist = text.split()
    print("A total of " + str(len(wordlist)) + " words can be found inside this text file.")

def rank_words():
    words = re.findall(r'\w+', text)
    top_10 = Counter(words).most_common(10)
    for word,count in top_10:
        print(f'{word:<4} {"-->":^4} {count:>4}')

def main():
    with open('Alice.txt') as f:
        text = f.read().lower()
        return text
        
if __name__ == '__main__':
    main()
    word_count(text)
    rank_words()
    pass 
Here is the output:
Output:
$ python with_word_count.py
Traceback (most recent call last):
  File "with_word_count.py", line 21, in <module>
    word_count(text)
NameError: name 'text' is not defined
The NameError says that the variable text “isn’t defined”. It is raised at line 21, where word_count(text) is called. But text is defined in main(), which is the first function I call under if __name__ == '__main__':. In case any of you are wondering why I organized my script this way, I am following @ichabod801's example from another recent thread of mine here. By the time word_count() is called, text should already have been returned by the previously called function, main(), right?

Would anyone care to elaborate on what the Python interpreter is saying in this traceback? What am I missing? What would I need to fix in my script for it to run as intended?
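For reference, the traceback is consistent with how Python scoping works: text inside main() is a local name, and the value main() returns is discarded unless the caller assigns it to something. rank_words() also takes no parameter, so it would look for a global text as well. A minimal sketch of one possible restructuring follows. The function name load_text, the n parameter, and the inline sample string are assumptions added for illustration and are not part of the original script:

```python
from collections import Counter
import re


def load_text(path):
    """Read the file at path and return its contents in lower case."""
    with open(path) as f:
        return f.read().lower()


def word_count(text):
    """Return the number of whitespace-separated words in text."""
    return len(text.split())


def rank_words(text, n=10):
    """Return the n most common words as (word, count) pairs."""
    words = re.findall(r'\w+', text)
    return Counter(words).most_common(n)


def main(text):
    # text arrives as an argument, so nothing depends on a global name.
    total = word_count(text)
    print(f"A total of {total} words can be found inside this text file.")
    for word, count in rank_words(text):
        print(f'{word:<4} {"-->":^4} {count:>4}')


if __name__ == '__main__':
    # With the real file this would be: main(load_text('Alice.txt'))
    # A short inline sample keeps the sketch runnable without the attachment.
    main("The cat and the hat and the bat".lower())
```

The key change is that every function receives text as a parameter and hands its result back with return, so the value is captured by the caller instead of being lost when the function exits.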

Attached is the public domain text file I am working with.

Attached Files

.txt   Alice.txt (Size: 159.97 KB / Downloads: 438)


Messages In This Thread
Extending my text file word count ranker and calculator - by Drone4four - Jan-19-2019, 12:50 AM
