Python Forum

Full Version: Code not reading http link from .txt file (Beginner level)
Hello fellow users,

I have a question about some code I wrote that reads .txt files. It's a practice exercise I found online that asks the student to write code that reads a .txt file and extracts all the numbers in it. The numbers should then be summed and the total returned. This is what I have so far:
file = input("Enter the name of the file to read: ")
text = open(file)

import re
numbers = list()
counter = 0

for lines in text:
    words = lines.split()
    for word in words:
        num = re.findall('[0-9]+', word)
        if len(num) != 1: continue
        for i in num:
            ii = int(i)
            numbers.append(ii)
            counter = counter+1


print("There are ",counter, "numbers")
print("and this is the list of them:")
print(numbers) 
print("The total sum of the numbers is: ",sum(numbers))  
The text to be read is the following:

The even better news is that I already came up with a simple program to find the 342 most common word in a text file. I wrote it, tested it, and now I am giving it to you to use so you can save some time. You don't even need to know Python to use this program. You will need to get through 34 Chapter 44ten of this book to fully understand the awesome Python techniques that were used to make the program. You are22 the end user, you simply use the program and marvel at its cleverness and how it saved you so much 333 manual effort. You simply type the code into a file called words.py and run it or you download the source code from http://www.py4e.com/code3/ and run it. www.py8e.com 456 because 34 595.
Okay, it seems the code is able to find numbers and sum them up. But I've noticed that the result (9 numbers, total 1868) doesn't match the correct solution (should be 10 numbers, total 1875). The difference is 7, which I concluded comes from the code not detecting the numbers in "http://www.py4e.com/code3/". However, the code IS able to detect the number in the other address, "www.py8e.com".

So basically, it seems the code has trouble with a string that contains "http", but I don't understand why or how to solve it.

Any ideas from you guys will be much appreciated.
Thanks :)
What is the purpose of line 12? Your problem has nothing to do with http. That just happens to be the only place in the text where you have two different numbers in the same "word".
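
A quick way to see this is to run the same pattern on a few individual words from the text. This is only a minimal sketch: the word list is picked by hand for illustration, and the `results` dictionary is a name made up for this example.

```python
import re

# A few "words" copied from the sample text (hand-picked for illustration)
words = ["342", "are22", "www.py8e.com", "http://www.py4e.com/code3/", "words.py"]

# Map each word to the list of digit runs that re.findall returns for it
results = {word: re.findall('[0-9]+', word) for word in words}

for word, matches in results.items():
    # len(matches) is what the check on line 12 compares against 1
    print(word, matches, len(matches))
```

The URL is the only word here that yields two matches ('4' and '3'), so it is the only word with digits for which `len(num) != 1` is true.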
(Dec-13-2020, 06:56 AM) bowlofred wrote: What is the purpose of line 12? Your problem has nothing to do with http. That just happens to be the only place in the text where you have two different numbers in the same "word".

Hey bowlofred, thanks for your reply.
I added line 12 to keep the "for" loop from running on empty lines. But I guess it's not needed here.
Regarding your suggestion, I agree that the code is only getting the first number in the word. How could I make the code read ALL the numbers (even if two are in the same word)?

Cheers
Just get rid of line 12. If it finds more than one number in the word, the continue on line 12 skips them.
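
With that check removed, the loop could look roughly like the sketch below. It is not the original script: to keep it self-contained it works on a string instead of a file, and `extract_numbers` and `sample` are names invented for this example (the sample is a shortened piece of the text from the question).

```python
import re

def extract_numbers(text):
    """Return every run of digits in text as an int, in order of appearance."""
    numbers = []
    for line in text.splitlines():
        for word in line.split():
            # No length check: a word may hold zero, one, or several numbers.
            # If findall returns an empty list, the inner loop simply does nothing.
            for match in re.findall('[0-9]+', word):
                numbers.append(int(match))
    return numbers

# Shortened excerpt of the text from the question (assumption: used only for the demo)
sample = "download it from http://www.py4e.com/code3/ and run it. www.py8e.com 456 because 34 595."

numbers = extract_numbers(sample)
print("There are", len(numbers), "numbers")
print(numbers)
print("The total sum of the numbers is:", sum(numbers))
```

No guard against empty results is needed, because looping over an empty list does nothing. Run over the full text from the question, this approach should give the expected total of 1875. The count comes out as 11 rather than 10, since the URL contributes both 4 and 3.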