Python Forum
Still playing with text files (Jose Portilla on Udemy)
#1
In a previous thread titled “Counting words in the last line of a file”, where I explored how to analyze text files, @DeaD_EyE graciously stepped in and rewrote my script from scratch to make it more Pythonic. @DeaD_EyE’s post is on page #2 (post #12) of that thread. The script I am working with now is at the bottom of this post.

It runs well and as intended. The script:
  • presents the user with different books to analyze,
  • shows the user how many lines there are,
  • asks the user which line they want to print,
  • prints the line number,
  • asks the user if they want to play again.

Now I am trying to add this feature to that list:
  • prints the single line chosen by the user

To achieve that, here are the changes that I made so far. I have:
  • initialized an empty request variable at the top of the script,
  • asked the user to choose the line (within the validate and choose function),
  • printed the line.

When I run the script, it turns out that it prints the selected line number twice. What I really want is for Python to print the chosen line's contents once and the line number once. I partially understand that the request variable, when it takes input (inside validate_and_choose()), picks up the integer value I typed, but it’s not clear to me how I can extract the actual contents of the line chosen by the user. What would you recommend?

Here is the latest iteration of my script:

#!/usr/bin/env python3
"""
Description

This is my tenth iteration of this text reading script. This iteration is based on DeaD_EyE's. DeaD_EyE's script runs well. It prompts the user to choose a line, counts the number of words and characters but I noticed that it doesn't actually print the selected line when the instructions say it will. here I attempt to add this missing functionality
"""
import sys
 

BOOKS = {1: 'Tolstoy.txt', 2: 'Alice.txt', 3: 'Chesterton.txt'}
request = None


def read_all_books():
    """
    Read all books from global variable BOOKS
    The keys are the digits
    """
    result = {}
    for number, filename in BOOKS.items():
        with open(filename) as fd:
            book_lines = fd.read().splitlines()
        result[number] = book_lines
        # maybe adding some metadata to the book
    return result
 
 
def which_book():
    """
    This function presents the user with a selection
    of 3 potential books to examine.
    """
    print("\nChoose from this list of books: \n 1. Tolstoy \n 2. Alice \n 3. Chesterton")
    while True:
        try:
            pick = int(input("What is your pick (1, 2, 3)? "))
        except ValueError:
            print(f'{pick} is not in the list. Enter a valid number in the range of available books.')
        return pick
 
 
def showcase(book):
    """
    This function essentially prints the entire book,
    line by line (but also prints the associated line numbers)
    """
    max_num = len(str(len(book))) + 1 # i know it's silly
    # just want to know how long the last linenumber is
    for num, line in enumerate(book):
        print(f'{num:>{max_num}}: {line.rstrip()}')
    # we don't return anything
    # the books are already loaded
 
  
def validate_and_choose(allowed_range):
    """
    This function ensures the user input is an integer
    and within the range of number of lines.
    """
    global request
    request = input('\nWhich line do you want to count and print? >> ')    
    range_text = f'integer in range {min(allowed_range)} - {max(allowed_range)}'
    while True:
        answer = input(f'Enter {range_text}: ')
        try:
            answer = int(answer)
            if answer in allowed_range:
                return answer        
            if answer not in allowed_range:
                raise ValueError  
        except ValueError:
            print(f'Expected {range_text} but input was "{answer}". Try again! ')
    
  
def again():
    """
    This function gives the user the ability to
     
    (1) start from the very beginning at the top,
    (2) restart half way through
    (3) exit
    """
    replay = input(
        "\n------------------------------------------------------\n" 
        "\nWould you like to choose a new line in the same book?\n" 
        "Or would you prefer to pick a line from a different book?\n"
        "\n'A' for the same book, \n'B' for a different book, "
        "or \n'C' to exit this program \n Make your selection: ")
    return replay.lower()
 
 
def main_menu():
    """
    Main menu + Loop for this game
    """
    print('Welcome to my game.')
    print('Maybe some help..')
    print(f'All books together take {utf8_in_memory / 1024**2:.2f} MiB in memory')
    # it's not right. It consumes more memory because of
    # the overhead of the dict itself and the list as holder for
    # the lines of the books
    print()
    book_key = which_book()
    # just a number, which is the key of the dict book_data
    while True:
        book = book_data[book_key]
        print()
        showcase(book)
        print()
        line_index = validate_and_choose(range(0, len(book)))
        words = len(book[line_index].split())
        # characters = len(book[line_index])
        characters = len(book[line_index].replace(' ', ''))
        # only characters, no whitespaces
        print(f'Here is the line # that you picked: "{line_index}"\n'
              f'Here is the content of the line that you picked:"{request}"\n'
              f'The number of words: {words}\n'
              f'The number of characters: {characters}')
        replay = again()
        if replay == 'a':
            continue
        elif replay == 'b':
            book_key = which_book()
        elif replay == 'c':
            print("Goodbye!")
            return 0
        else:
            print(
                "I'll take that answer as a request"
                "to exit this program. Goodbye for now!"
                )
            return 1
  
  
if __name__ == "__main__":
    book_data = read_all_books()
    # book_data on module level.
    # it could be in main()
    # but then you must pass this around
    utf8_in_memory = sum(sys.getsizeof(line) for book in book_data.values() for line in book)
    try:
        retval = main_menu()
    except KeyboardInterrupt:
        retval = 10
        print('\nGoodbye!')
    sys.exit(retval) # maybe as information for other shell citizens
Here is my script on GitHub on the master branch. For my future reference, this exercise is part of Jose Portilla's Udemy course at this specific module: Python-Narrative-Journey/02-Field-Readiness-Exam-1/01-Field-Readiness-Exam-One.ipynb
#2
Howdy. I have made a few changes and observations regarding your code.
  • within the main loop you refer to a global variable request,
  • validate_and_choose() sets it once, before its while loop, to whatever was typed at the first prompt (a string, not the text of the line),
  • each iteration of the while loop updates the local answer but never request.

  • Suggestion: don't use globals if you don't have to.
  • This function only needs to return an integer anyway.

  • Suggestion: you ask for a choice and then immediately ask again. Ask once, inside the while loop.
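
A minimal sketch of the last two suggestions, assuming a hypothetical helper named choose_line and a hypothetical ask parameter (neither is in the original script; ask defaults to the built-in input and is there only so the function can be tried without typing):

```python
def choose_line(allowed_range, ask=input):
    """Prompt until the user enters an integer inside allowed_range.

    `ask` is a hypothetical parameter added for illustration; it defaults
    to the built-in input().
    """
    range_text = f'integer in range {min(allowed_range)} - {max(allowed_range)}'
    while True:
        answer = ask(f'Enter {range_text}: ')  # the only prompt, inside the loop
        try:
            number = int(answer)
        except ValueError:
            number = None  # not a number at all
        if number in allowed_range:
            return number  # handed back to the caller; no global needed
        print(f'Expected {range_text} but input was "{answer}". Try again!')
```

The caller keeps the result in a local variable, for example line_index = choose_line(range(0, len(book))), so nothing has to be shared through module state.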

Explanations: I removed the global request in favour of returning the local answer from validate_and_choose(). The main application loop was very close: it already had the correct line number. The key is to realize that book is a list containing each line of your file, so book[line_index] is the text of the chosen line. That is how you counted the number of words, and it is also what showcase() iterates over. I also renamed the variable words to word_count to make it clearer.
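
To make the indexing concrete, here is a small sketch. The three strings are sample lines standing in for one entry of book_data, not the contents of the poster's files:

```python
# Stand-in for one book; the real script builds this list from a text file.
book = [
    'Alice was beginning to get very tired',
    'of sitting by her sister on the bank,',
    'and of having nothing to do:',
]

line_index = 1                 # what validate_and_choose() would return
line_text = book[line_index]   # the content of that line
word_count = len(line_text.split())

print(f'Here is the line # that you picked: "{line_index}"')
print(f'Here is the content of the line that you picked: "{line_text}"')
print(f'The number of words: {word_count}')
```

The integer is only a position in the list; the text comes from looking that position up.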

I would suggest learning to use pdb so that you can watch your program run step by step. This lets you see exactly what the variable values are and will lead to ah-ha moments. One good resource for this is How to Use Pdb to Debug Your Code, though there are many around the internet. In my experience pudb is more visual and easier to use, but look around. You could even use the Community edition of PyCharm just for debugging and continue to use whatever editor you prefer.
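
As a rough illustration of what stepping through looks like, the sketch below uses the built-in breakpoint() (available since Python 3.7), which opens pdb by default. The function name count_words and its debug flag are made up for this example and are not part of the script:

```python
def count_words(book, line_index, debug=False):
    """Count the words on one line; `debug` is a hypothetical flag for this sketch."""
    line = book[line_index]
    if debug:
        # Execution pauses here in pdb. Useful commands:
        #   p line   print the value of a variable
        #   n        run the next statement
        #   c        continue until the program ends or hits another breakpoint
        breakpoint()
    return len(line.split())


book = ['first line of text', 'second line']
print(count_words(book, 0))  # pass debug=True to stop inside the function
```

Typing p line at the (Pdb) prompt shows the actual text being counted, which is the kind of check that would have exposed what request really held.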

#!/usr/bin/env python3
"""
Description
This is my tenth iteration of this text reading script. This iteration is based on DeaD_EyE's. DeaD_EyE's script runs well. It prompts the user to choose a line, counts the number of words and characters but I noticed that it doesn't actually print the selected line when the instructions say it will. here I attempt to add this missing functionality
"""
import sys

BOOKS = {1: 'Tolstoy.txt', 2: 'Alice.txt', 3: 'Chesterton.txt'}


def read_all_books():
    """
    Read all books from global variable BOOKS
    The keys are the digits
    """
    result = {}
    for number, filename in BOOKS.items():
        with open(filename) as fd:
            book_lines = fd.read().splitlines()
        result[number] = book_lines
        # maybe adding some metadata to the book
    return result


def which_book():
    """
    This function presents the user with a selection
    of 3 potential books to examine.
    """
    print(
        "\nChoose from this list of books: \n 1. Tolstoy \n 2. Alice \n 3. Chesterton"
    )
    while True:
        try:
            pick = int(input("What is your pick (1, 2, 3)? "))
        except ValueError:
            print(
                f'{pick} is not in the list. Enter a valid number in the range of available books.'
            )
        return pick


def showcase(book):
    """
    This function essentially prints the entire book,
    line by line (but also prints the associated line numbers)
    """
    max_num = len(str(len(book))) + 1  # i know it's silly
    # just want to know how long the last linenumber is
    for num, line in enumerate(book):
        print(f'{num:>{max_num}}: {line.rstrip()}')
    # we don't return anything
    # the books are already loaded


def validate_and_choose(allowed_range):
    """
    This function ensures the user input is an integer
    and within the range of number of lines.
    """
    print('\nWhich line do you want to count and print? >> \n')
    range_text = f'integer in range {min(allowed_range)} - {max(allowed_range)}'
    while True:
        answer = input(f'Enter {range_text}: ')
        try:
            answer = int(answer)
            if answer in allowed_range:
                return answer
            if answer not in allowed_range:
                raise ValueError
        except ValueError:
            print(
                f'Expected {range_text} but input was "{answer}". Try again! ')


def again():
    """
    This function gives the user the ability to
      
    (1) start from the very beginning at the top,
    (2) restart half way through
    (3) exit
    """
    replay = input(
        "\n------------------------------------------------------\n"
        "\nWould you like to choose a new line in the same book?\n"
        "Or would you prefer to pick a line from a different book?\n"
        "\n'A' for the same book, \n'B' for a different book, "
        "or \n'C' to exit this program \n Make your selection: ")
    return replay.lower()


def main_menu():
    """
    Main menu + Loop for this game
    """
    print('Welcome to my game.')
    print('Maybe some help..')
    print(
        f'All books together take {utf8_in_memory / 1024**2:.2f} MiB in memory'
    )
    # it's not right. It consumes more memory because of
    # the overhead of the dict itself and the list as holder for
    # the lines of the books
    print()
    book_key = which_book()
    # just a number, which is the key of the dict book_data
    while True:
        book = book_data[book_key]
        print()
        showcase(book)
        print()
        line_index = validate_and_choose(range(0, len(book)))
        word_count = len(book[line_index].split())
        # characters = len(book[line_index])
        characters = len(book[line_index].replace(' ', ''))
        # only characters, no whitespaces
        print(f'Here is the line # that you picked: "{line_index}"\n'
              f'Here is the content of the line that you picked:'
              f'"{book[line_index]}"\n'
              f'The number of words: {word_count}\n'
              f'The number of characters: {characters}')
        replay = again()
        if replay == 'a':
            continue
        elif replay == 'b':
            book_key = which_book()
        elif replay == 'c':
            print("Goodbye!")
            return 0
        else:
            print("I'll take that answer as a request"
                  "to exit this program. Goodbye for now!")
            return 1


if __name__ == "__main__":
    book_data = read_all_books()
    # book_data on module level.
    # it could be in main()
    # but then you must pass this around
    utf8_in_memory = sum(
        sys.getsizeof(line) for book in book_data.values() for line in book)
    try:
        retval = main_menu()
    except KeyboardInterrupt:
        retval = 10
        print('\nGoodbye!')
    sys.exit(retval)  # maybe as information for other shell citizens
Here is the diff output of your file vs mine. I did a bunch of formatting, so there will be many changes, but only three relate to functionality.

Output:
4d3
<
8,9c7
<
<
---
>
11,13c9,10
< request = None
<
<
---
>
>
26,27c23,24
<
<
---
>
>
33c30,32
<     print("\nChoose from this list of books: \n 1. Tolstoy \n 2. Alice \n 3. Chesterton")
---
>     print(
>         "\nChoose from this list of books: \n 1. Tolstoy \n 2. Alice \n 3. Chesterton"
>     )
38c37,39
<             print(f'{pick} is not in the list. Enter a valid number in the range of available books.')
---
>             print(
>                 f'{pick} is not in the list. Enter a valid number in the range of available books.'
>             )
40,41c41,42
<
<
---
>
>
47c48
<     max_num = len(str(len(book))) + 1 # i know it's silly
---
>     max_num = len(str(len(book))) + 1  # i know it's silly
53,54c54,55
<
<
---
>
>
60,61c61
<     global request
<     request = input('\nWhich line do you want to count and print? >> ')
---
>     print('\nWhich line do you want to count and print? >> \n')
68c68
<                 return answer
---
>                 return answer
70c70
<                 raise ValueError
---
>                 raise ValueError
72,74c72,75
<             print(f'Expected {range_text} but input was "{answer}". Try again! ')
<
<
---
>             print(
>                 f'Expected {range_text} but input was "{answer}". Try again! ')
>
>
84,85c85,86
<         "\n------------------------------------------------------\n"
<         "\nWould you like to choose a new line in the same book?\n"
---
>         "\n------------------------------------------------------\n"
>         "\nWould you like to choose a new line in the same book?\n"
90,91c91,92
<
<
---
>
>
98c99,101
<     print(f'All books together take {utf8_in_memory / 1024**2:.2f} MiB in memory')
---
>     print(
>         f'All books together take {utf8_in_memory / 1024**2:.2f} MiB in memory'
>     )
111c114
<         words = len(book[line_index].split())
---
>         word_count = len(book[line_index].split())
116,117c119,121
<               f'Here is the content of the line that you picked:"{request}"\n'
<               f'The number of words: {words}\n'
---
>               f'Here is the content of the line that you picked:'
>               f'"{book[line_index]}"\n'
>               f'The number of words: {word_count}\n'
128,131c132,133
<             print(
<                 "I'll take that answer as a request"
<                 "to exit this program. Goodbye for now!"
<                 )
---
>             print("I'll take that answer as a request"
>                   "to exit this program. Goodbye for now!")
133,134c135,136
<
<
---
>
>
140c142,143
<     utf8_in_memory = sum(sys.getsizeof(line) for book in book_data.values() for line in book)
---
>     utf8_in_memory = sum(
>         sys.getsizeof(line) for book in book_data.values() for line in book)
146c149
<     sys.exit(retval) # maybe as information for other shell citizens
---
>     sys.exit(retval)  # maybe as information for other shell citizens
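
A side note that is separate from the three functional changes: which_book() looks like it has a latent problem in both versions of the script. If int() raises ValueError, pick has never been assigned, so the f-string in the except branch fails with UnboundLocalError instead of printing the message. The return also sits outside the try, so the loop never repeats, and a number such as 7 is accepted and later fails as a KeyError on book_data[book_key]. One possible rewrite is sketched below; the ask parameter is a hypothetical addition (defaulting to input) so the function can be exercised without typing:

```python
BOOKS = {1: 'Tolstoy.txt', 2: 'Alice.txt', 3: 'Chesterton.txt'}


def which_book(ask=input):
    """Prompt until the user picks a key that exists in BOOKS."""
    print("\nChoose from this list of books: \n 1. Tolstoy \n 2. Alice \n 3. Chesterton")
    while True:
        raw = ask("What is your pick (1, 2, 3)? ")
        try:
            pick = int(raw)
        except ValueError:
            pick = None  # not a number at all
        if pick in BOOKS:
            return pick
        # Report the raw text, which is always defined, rather than `pick`.
        print(f'{raw} is not in the list. Enter a valid number in the range of available books.')
```

Checking membership in BOOKS rather than a hard-coded range means the prompt keeps working if books are added later.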
#3
Hi @knackwurstbagel! Thank you for reading and experimenting with my script. Your reply contains the solution that I needed. I know my script at this point is large so I appreciate your time, my friend.

I don’t understand the diff output you shared. So instead I just copied your new script into a new text file and then compared yours with mine using my favourite difftool, p4merge. With p4merge I was able to see the formatting and cosmetic changes where you indented some of my strings and such to make it easier to read.
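
For anyone else puzzled by that output: it is the classic "normal" diff format. A header such as 111c114 means line 111 of the first file was changed into line 114 of the second (d marks a deletion, a an addition); lines starting with < come from the first file and lines starting with > from the second. A unified diff is often easier to read, and Python's standard difflib module can produce one. The sketch below compares a single line from each script; the file names mine.py and yours.py are placeholders:

```python
import difflib

old = ['words = len(book[line_index].split())']
new = ['word_count = len(book[line_index].split())']

# lineterm='' because these sample lines carry no trailing newlines.
diff = list(difflib.unified_diff(old, new,
                                 fromfile='mine.py', tofile='yours.py',
                                 lineterm=''))
print('\n'.join(diff))
```

In the result, lines beginning with - exist only in the first file and lines beginning with + only in the second.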

I had realized that global variables should be used sparingly, but I wasn’t sure how else to process the line number. Now that you have made the small change (at line 120), it is clear how straightforward the solution was. Instead of passing in request (as I had it), you just passed in book[line_index]. That was easy enough! =)

I’ve taken a look at your link to PyBites’ blog post on how to use the Python debugger. I’m not sure I understand it fully yet, and I already have a lot of new questions. I’m going to jump in now, and I'll write a new forum thread here with some more questions if need be.

Thanks again!