Python Forum
Counting words in the last line of a file
#1
How do I count how many words there are in the last line of a file?

Here are the contents of the text file:

Quote:Welcome to your First Exam Recruit.
Only the best recruits can become agents.
Do you have what it takes?
We will test your knowledge with this field readiness exam.
It should be pretty simple, since you only know the basics so far.
Let's get started.
Best of luck recruit.

Here is the script I wrote, with annotations explaining my understanding of what is going on, line by line:

with open('field.txt') as field_variable: # Opening the file as a variable
    field_variable = field_variable.readlines() # Reading the file
    list(field_variable) # Turning file contents into a list
    for i in field_variable: # Splitting each line in the list
        i.split()
    last = field_variable[-1] # Assigning the last line from the bottom by reverse slicing to a variable called, last
    for n in last: # Loop through last line
        num_chars = last.split(n) # Split last variable, assign to new variable, num_chars
        result = last.count(num_chars) # Now count number of list items
        print(result)     # Print final_chars
According to my interpreter, the problem is at line 9:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-27-082decdb195e> in <module>
      7     for n in last: # Loop through last line:
      8         num_chars = last.split(n) # Split last variable, assign to new variable, num_chars
----> 9         final_chars = last.count(num_chars) # Now count number of list items
     10         print(final_chars)     # Print final_chars

TypeError: must be str, not list
Apparently Python is expecting “a string, not a list”.

There are three variables in operation at line 9: result, last and num_chars.

result is being assigned as the final integer (and end result or destination of this script - - an integer)
last contains the string of the last line of the text file, distinguished at line 6
num_chars is the integer count of the number of characters in the last line

Why is the number of characters from the last line not printing?

For my personal future reference, I’m working on Jose Portilla’s public GitHub repo for his open course on Udemy at this specific module: Python-Narrative-Journey/02-Field-Readiness-Exam-1/01-Field-Readiness-Exam-One.ipynb
#2
If memory is not an issue, simple brute force will do: take the last line from the list of lines, split it, and take the length of the result:

>>> with open('last_row.txt', 'r') as lines:
...     print(len(lines.readlines()[-1].split()))
...
4
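As for the traceback in the first post: last.split(n) returns a list, and str.count() only accepts a string to search for, which is most likely why Python reports "must be str, not list". A minimal sketch, with the last line of the sample file hard-coded in place of reading field.txt (an assumption, just for illustration):

```python
# The last line of the sample file, hard-coded here for illustration
last = "Best of luck recruit."

# split() with no argument splits on whitespace and returns a list of words
words = last.split()
print(words)        # ['Best', 'of', 'luck', 'recruit.']

# len() counts the items in that list, i.e. the number of words
print(len(words))   # 4

# str.count() counts occurrences of a substring, so it needs a str argument;
# passing the list returned by split() raises TypeError
raised = False
try:
    last.count(words)
except TypeError as err:
    raised = True
    print("TypeError:", err)
```

So num_chars in the original script is a list of pieces, not an integer, and the count of words comes from len() rather than count().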
I'm not 'in'-sane. Indeed, I am so far 'out' of sane that you appear a tiny blip on the distant coast of sanity. Bucky Katt, Get Fuzzy

Da Bishop: There's a dead bishop on the landing. I don't know who keeps bringing them in here. ....but society is to blame.
#3
Just to show a way that avoids reading the whole file into memory (as readlines() does): loop over the file with pass, and after the loop the variable holds only the last line.
with open('last_line.txt') as f:
    for line in f:
        pass

line = line.split()
print(f'Number of word: {len(line)}\nNumber of characters: {len("".join(line))}')
Output:
Number of word: 4
Number of characters: 18
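One thing to watch with the loop above: on an empty file the loop body never runs, so line is never assigned. A possible alternative that also keeps only one line in memory is collections.deque with maxlen=1. This is just a sketch; last_line() is a hypothetical helper name, and the temporary file stands in for the real text file:

```python
from collections import deque
import os
import tempfile

def last_line(path):
    """Return the last line of a text file without keeping earlier lines."""
    with open(path) as f:
        # A deque with maxlen=1 discards every line except the most recent one
        tail = deque(f, maxlen=1)
    # An empty file leaves the deque empty
    return tail[0] if tail else ""

# Demonstration with a temporary file (contents are a stand-in for field.txt)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Let's get started.\nBest of luck recruit.\n")

words = last_line(tmp.name).split()
print(len(words))  # 4
os.remove(tmp.name)
```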
#4
Yet another, more complicated way which should be efficient when files are large (uses built-in module mmap - Memory-mapped file support)

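import mmap

def last_row(filename, access=mmap.ACCESS_READ):
    # Open in binary mode; the with block closes the file descriptor afterwards
    with open(filename, 'rb') as f:
        # Length 0 maps the whole file
        with mmap.mmap(f.fileno(), 0, access=access) as m:
            # Find the last newline before the final byte and slice from just after it
            return m[m.rfind(b'\n', 0, -1) + 1:].decode('utf-8').rstrip()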
Quote:It should be emphasized that memory mapping a file does not cause the entire file to be read into memory. That is, it’s not copied into some kind of memory buffer or array. Instead, the operating system merely reserves a section of virtual memory for the file contents. As you access different regions, those portions of the file will be read and mapped into the memory region as needed. However, parts of the file that are never accessed simply stay on disk. This all happens transparently, behind the scenes.

Then just:

>>> print(len(last_row('field.txt').split()))
4
#5
@perfringo, @snippsat: Thank you, both! You’ve provided some elegant alternatives to achieving my desired outcome. I have taken your suggestions and assembled a fresh script from scratch.

The second line in perfringo’s initial post is a one-liner that chains several method calls. I understand each operation, one after the other. Being the novice that I am, I have rewritten it spread out over three lines because this is just my "naive" preference for now. See below for my script with this change.

I also really liked snippsat’s use of f-string formatting. Inside each placeholder in the string, snippsat performs the operations in place, rather than computing the values beforehand and slipping the resulting variables into the placeholders. For example, instead of print(f'Number of characters: {len("".join(line))}'), it could be rewritten as print(f'Number of characters: {variable_x}') with variable_x defined elsewhere in the script as len("".join(line)). I prefer it this way. Keeping this preference in mind, here is my script now:

def user_choice():
   choice = int(input("Which line do you want to count and print? (Choose within -7 to 6) "))
   if choice in range(-7,7):
       return choice
   else:
       print("You are either way too high or way too low! Try again.")
       final = user_choice()
       main(final)

def main(choice):   
   with open('field.txt', 'r') as lines:
       picked_line = lines.readlines()[choice]
       result_num = picked_line.split()
       length_of_line = len(result_num)
       num_chars = len("".join(picked_line))
       print(f'Here is the line that you picked:\n   "{picked_line}" \nThe number of words: {length_of_line}\nThe number of characters: {num_chars}')
       replay = str(input("Would you like to choose again? Y/N? "))
       if replay == "y":
           final = user_choice()
           main(final)
       if replay == "n":
           print("Seeya!")
           pass
       else:
           print("I'll take that as a 'No'. Goodbye for now!")
           pass

if __name__ == "__main__":
   final = user_choice()
   main(final)
It runs beautifully. I have also added a function which prompts the user to choose which line they want to count, and a conditional that asks whether they would like to play again. Here is some sample output:

Quote:$ python daniel-script3.py
Which line do you want to count and print? (Choose within -7 to 6) 22222222
You are either way too high or way too low! Try again.
Which line do you want to count and print? (Choose within -7 to 6) 2
Here is the line that you picked:
"Do you have what it takes?
"
The number of words: 6
The number of characters: 27
Would you like to choose again? Y/N? y
Which line do you want to count and print? (Choose within -7 to 6) 24
You are either way too high or way too low! Try again.
Which line do you want to count and print? (Choose within -7 to 6) 1
Here is the line that you picked:
"Only the best recruits can become agents.
"
The number of words: 7
The number of characters: 42
Would you like to choose again? Y/N? n
Seeya!

It’s awesome, isn’t it? I suppose the only outstanding issue I can't figure out is how to get the second quotation mark for the printed line to appear on one line instead of being disjointed on a second line. How could I fix this?

Otherwise, I am really pleased with how this has turned out so far.

Taking this script to the next level, I am trying to use a for loop and the enumerate function to show the user the available lines and their corresponding line numbers. I’ve called this function showcase(). See below. I’ve hit a wall here, though, because now my interpreter is throwing all kinds of errors. I’ve tried swapping different methods in and out, which changes the traceback each time, but I can’t make the right changes to achieve the intended output.

def showcase():
   with open('field.txt', 'r') as lines:
       lines2 = lines.readlines().split()
       # lines = lines.split()
       for num, line in enumerate(lines2):
           print(num, line)

def user_choice():
   choice = int(input("Which line do you want to count and print? (Choose within -7 to 6) "))
   if choice in range(-7,6):
       return choice
   else:
       print("You are either way too high or way too low! Try again.")
       final = user_choice()
       main(final)

def main(choice):   
   with open('field.txt', 'r') as lines:
       picked_line = lines.readlines()[choice]
       result_num = picked_line.split()
       length_of_line = len(result_num)
       num_chars = len("".join(picked_line))
       print(f'Here is the line that you picked:\n   "{picked_line}" \nThe number of words: {length_of_line}\nThe number of characters: {num_chars}')
   # return (picked_line, length_of_line, num_chars)
       replay = str(input("Would you like to choose again? Y/N? "))
       if replay == "y":
           user_choice()
       if replay == "n":
           print("Seeya!")
           pass
       else:
           print("I'll take that as a 'No'. Goodbye for now!")
           pass

if __name__ == "__main__":
   showcase()
   final = user_choice()
   main(final)
What would you people recommend I try to fix the new function with the enumerate operation?
#6
(May-02-2019, 05:41 PM)Drone4four Wrote: I suppose the only outstanding issue I can't figure out is how to get the second quotation mark for the printed line to appear on one line instead of being disjointed on a second line. How could I fix this?

This happens because every line in the file ends with a newline ('\n'). You need to strip it away, using either the general .strip('\n') or the more specific .rstrip('\n') (row #12).

You may consider reading data into dictionary. This way it will be dictionary lookup not reading the whole file again if user chooses several times.
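A rough sketch of both suggestions together, assuming the lines have already been read once. The raw_lines list stands in for what readlines() returns, and lines_by_number is a made-up name:

```python
# Sample lines standing in for what readlines() returns (each ends with '\n')
raw_lines = ["Only the best recruits can become agents.\n",
             "Do you have what it takes?\n"]

# Build the lookup once: line number -> line text without the trailing newline
lines_by_number = {num: line.rstrip('\n') for num, line in enumerate(raw_lines)}

picked_line = lines_by_number[1]
print(f'"{picked_line}"')          # closing quote stays on the same line
print(len(picked_line.split()))    # 6
```

Note that, unlike list indexing, a dictionary built this way has no negative keys, so choices such as -1 would need separate handling.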
#7
Quote:This happens because every line in the file ends with a newline ('\n'). You need to strip it away, using either the general .strip('\n') or the more specific .rstrip('\n') (row #12).

Thanks @perfringo: Good catch! \n indicates a line break. To remove the \n I’ve invoked .rstrip, as you suggested. That fixed the issue. I added .rstrip to more than one place in my script.

I also figured out the issue with my showcase() function. All I had to do was remove the split() method and it runs perfectly now.

Here is what this function looks like now:

def showcase():
   with open('field.txt', 'r') as lines:
       lines2 = lines.readlines()
       for num, line in enumerate(lines2):
           print(num, line.rstrip())
Awesome! I’m stoked. It works.
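For the record, the reason the earlier version failed is most likely that readlines() returns a list, and split() is a string method that lists do not have. A small sketch, with io.StringIO standing in for the opened file so it runs without field.txt (that substitution is an assumption for illustration):

```python
import io

# io.StringIO stands in for the opened file so the sketch runs without field.txt
fake_file = io.StringIO("Let's get started.\nBest of luck recruit.\n")
lines2 = fake_file.readlines()

print(type(lines2).__name__)      # list
print(hasattr(lines2, "split"))   # False: split() is a str method, not a list method

# Each element of the list is a str, so string methods work per line
for num, line in enumerate(lines2):
    print(num, line.rstrip())
```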

I’m taking my script to the next level by trying to add a feature which prompts the user to enter a number if the user chooses a letter of the alphabet when asked for an integer.

Here was my awkward first iteration:

def user_choice():
   choice = input("\nWhich line do you want to count and print? (Choose between 0 to 6) > ")
   if choice.isalpha():
       print("I can't accept a letter of the alphabet. Choose a number between 0 and 6")
   if choice in range(0,7):
       return choice
   else:
       print("You are either way too high or way too low! Try again.")
       final = user_choice()
       main(final)
That didn’t work very well.

So I found a detailed Stack Overflow question titled: “Asking the user for input until they give a valid response”. With the ensuing answers to that question, here is my second iteration where I attempt to implement the functionality with a while loop and the try operation:

def user_choice():
   choice = input("\nWhich line do you want to count and print? (Choose between 0 to 6) > ")
   while True:
       try:
           if choice in range(0,7):
               return choice
       except choice.isalpha():
           print("I can't accept a letter of the alphabet. Choose a number between 0 and 6")
       else:
           print("You are either way too high or way too low! Try again.")
           final = user_choice()
           main(final)
When I run the script, after the contents of the script print and the script prompts the user for an integer, it just keeps re-prompting the user with: “You are either way too high or way too low! Try again.” I figure this is because my while loop isn’t formed properly. There is no traceback and my Python linter plugin in VSC doesn’t draw my attention to any particular line so at least the code is valid.

What would you people recommend I could try next to make the script correctly check to ensure the user is entering integers and not letters of the alphabet, in particular inside my user_choice() function?

Here is the script in its current iteration in full:

# This is my second attempt: the script adds a feature which ensures that the choice made by the user is not a letter of the alphabet, except in this iteration I use the 'try' statement

def showcase():
   with open('field.txt', 'r') as lines:
       lines2 = lines.readlines()
       for num, line in enumerate(lines2):
           print(num, line.rstrip())

def user_choice():
   choice = input("\nWhich line do you want to count and print? (Choose between 0 to 6) > ")
   while True:
       try:
           if choice in range(0,7):
               return choice
       except choice.isalpha():
           print("I can't accept a letter of the alphabet. Choose a number between 0 and 6")
       else:
           print("You are either way too high or way too low! Try again.")
           final = user_choice()
           main(final)

def main(choice):   
   with open('field.txt', 'r') as lines:
       picked_line = lines.readlines()[choice]
       picked_line = picked_line.rstrip()
       result_num = picked_line.split()
       length_of_line = len(result_num)
       num_chars = len("".join(picked_line))
       print(f'Here is the line that you picked:\n   "{picked_line}" \nThe number of words: {length_of_line}\nThe number of characters: {num_chars}')
       replay = str(input("Would you like to choose again? Y/N? "))
       if replay == "y":
           final = user_choice()
           main(final)
       if replay == "n":
           print("Seeya!")
           pass
       else:
           print("I'll take that as a 'No'. Goodbye for now!")
           pass

if __name__ == "__main__":
   showcase()
   final = user_choice()
   main(final)
Attached is the text file I am working with.

Attached Files

.txt   field.txt (Size: 273 bytes / Downloads: 115)
#8
My question still stands:
Quote:What would you people recommend I could try next to make the script correctly check to ensure the user is entering integers and not letters of the alphabet, in particular inside my user_choice() function?

What might you people suggest?
#9
For validating user input, something along these lines is usually used:

def validate(request, allowed_range):
    range_text = f'integer in range {min(allowed_range)} - {max(allowed_range)}'
         
    while True:
        answer = input(f'{request} (enter {range_text}) ')
        try:
            answer = int(answer)
            if answer in allowed_range:
                return answer
            raise ValueError
         
        except ValueError:
            print(f'Expected {range_text} but input was {answer} ')
             
       
# usage           
validate('Which line do you want to count and print? ', range(7))
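Some context on why the earlier attempt kept re-prompting, as far as I can tell: input() always returns a string, so choice in range(0,7) is never true without int(); no exception is raised inside the try, so the except clause is never reached and the else branch runs every time. A short sketch of the difference (the literal "3" stands in for what a user might type):

```python
choice = "3"                        # input() always returns a str

print(choice in range(0, 7))        # False: the str "3" is not the int 3
print(int(choice) in range(0, 7))   # True once converted

# int() raises ValueError for non-numeric text, which is what 'except ValueError' catches
caught = False
try:
    int("abc")
except ValueError:
    caught = True
    print("not an integer")
```

Also worth noting: except expects an exception class such as ValueError, not the result of a call like choice.isalpha().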
#10
Thank you, @perfringo! This runs beautifully. For my future reference and for the record, copied below is my script in full. You may notice a few small changes I made to your suggested functions, @perfringo. For example, I renamed the function to validate_and_choose(), for the try operation I added a conditional to trigger the raise ValueError operation, and I replaced the 7 parameter for the range function with 0,7. I realize that @perfringo's syntax is more Pythonic. But for my purposes, these are just some minor changes which help me understand, being the naive novice that I am. Thanks again, my friend!

import pdb

# This is my seventh attempt at this text reading script. This was written with lots of help from perfringo on Python-forum.io, in particular

def showcase():
    with open('field.txt', 'r') as lines:
        lines2 = lines.readlines()
        for num, line in enumerate(lines2):
            # pdb.set_trace()
            print(num, line.rstrip())

def validate_and_choose(request, allowed_range):
    range_text = f'integer in range {min(allowed_range)} - {max(allowed_range)}'
    while True:
        answer = input(f'{request} (enter {range_text}) ')
        try:
            answer = int(answer)
            if answer in allowed_range:
                return answer        
            if answer not in allowed_range:
                raise ValueError  
        except ValueError:
            print(f'Expected {range_text} but input was {answer}. Try again! ')

def main(choice):    
    with open('field.txt', 'r') as lines:
        picked_line = lines.readlines()[choice]
        picked_line = picked_line.rstrip()
        result_num = picked_line.split()
        length_of_line = len(result_num)
        num_chars = len("".join(picked_line))
        print(f'Here is the line that you picked:\n   "{picked_line}" \nThe number of words: {length_of_line}\nThe number of characters: {num_chars}')
        replay = input("Would you like to choose again? Y/N? ").strip().lower()
        if replay == "y":
            final = validate_and_choose('Which line do you want to count and print? ', range(0,7))
            main(final)
        elif replay == "n":  # 'replay == "n" or "N"' was always true because "N" is truthy
            print("Seeya!")
        else:
            print("I'll take that as a 'No'. Goodbye for now!")

if __name__ == "__main__":
    showcase()
    # pdb.set_trace()
    final = validate_and_choose('Which line do you want to count and print? ', range(0,7))
    # pdb.set_trace()
    main(final)
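For what it's worth, the replay logic could also be written as a loop rather than having main() call itself. A rough sketch, not a drop-in replacement: describe(), wants_replay(), run() and the ask parameter are names made up here for illustration, and the range validation from validate_and_choose() is left out for brevity.

```python
def describe(line):
    """Return (word_count, char_count) for one line, ignoring the trailing newline."""
    text = line.rstrip('\n')
    return len(text.split()), len(text)

def wants_replay(answer):
    """True only for an explicit yes; anything else is treated as no."""
    return answer.strip().lower() in ("y", "yes")

def run(lines, ask=input):
    """Loop instead of recursing; 'ask' is injectable so the loop can be tested."""
    while True:
        choice = int(ask("Which line? "))
        words, chars = describe(lines[choice])
        print(f"words: {words}, characters: {chars}")
        if not wants_replay(ask("Choose again? Y/N? ")):
            print("Seeya!")
            break
```

With the file read once into a list, usage would be along the lines of run(open('field.txt').readlines()), though a with block is the safer way to open the file.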