Python Forum
average word length - Printable Version




average word length - syn09001 - Jul-17-2018

I'm trying to write a simple program that calculates the average length of the words in a sentence. My issue is that spaces are taken into account when counting characters, which gives a higher average. The replace("","") call was supposed to eliminate spaces from the input, but it does not seem to work. Any ideas? Thank you.

def main():
    sentence = input("Enter text: ")
    words = len(sentence.split())
    chars = len(sentence.replace("",""))
    avg = chars / words
    print("Your average word length is:", round(avg))
main()



RE: average word length - micseydel - Jul-17-2018

I would guess that you intended
chars = len(sentence.replace("",""))
to be
chars = len(sentence.replace(" ",""))
(Note the extra character in the second line.)

Alternatively, you can sum the lengths of the individual words produced by sentence.split() to get the number of characters.
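
A minimal sketch of that alternative, assuming Python 3. The function name average_word_length and the empty-input guard are illustrative additions, not part of the original program. Note that replace(" ", "") strips only space characters, while summing over sentence.split() ignores tabs and newlines as well.

```python
def average_word_length(sentence):
    """Return the mean length of the whitespace-separated words in sentence."""
    words = sentence.split()
    if not words:
        # Guard against ZeroDivisionError on empty or all-whitespace input.
        return 0.0
    # Sum the length of each word instead of stripping spaces from the string.
    total_chars = sum(len(word) for word in words)
    return total_chars / len(words)


print(average_word_length("The quick brown fox"))  # 16 characters / 4 words = 4.0
```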


RE: average word length - perfringo - Jul-17-2018

To get accurate results, you should also handle punctuation marks. If the entered text is 'Yes? No! Yes? No!', the computed average length is 1 higher than the real average word length (3.5 instead of 2.5).

One way to deal with it:

# remove all whitespace (spaces, tabs, newlines)
chars = ''.join(sentence.split())
# list of symbols you don't want to count as characters
nonchars = [".", "!", "?", ",", ":", ";", "-", "'"]
letters = len([char for char in chars if char not in nonchars])
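
To check the effect on the 'Yes? No! Yes? No!' example, here is a small self-contained sketch built around the snippet above, assuming Python 3. The names NONCHARS and average_ignoring_punctuation are made up for illustration.

```python
# Symbols that should not count as characters (same set as the list above).
NONCHARS = set(".!?,:;-'")


def average_ignoring_punctuation(sentence):
    """Average word length, not counting whitespace or the symbols in NONCHARS."""
    words = sentence.split()
    if not words:
        return 0.0
    chars = "".join(words)
    letters = len([char for char in chars if char not in NONCHARS])
    return letters / len(words)


text = "Yes? No! Yes? No!"
# Naive average: punctuation is counted as part of each word.
naive = len("".join(text.split())) / len(text.split())
print(naive)                               # 3.5
print(average_ignoring_punctuation(text))  # 2.5
```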



RE: average word length - micseydel - Jul-17-2018

I think perfringo has the right idea, but that wouldn't capture double-quotes ("). I would suggest a whitelist rather than a blacklist
from string import letters
# [...]
letter_count = len([char for char in chars if char in letters])



RE: average word length - syn09001 - Jul-17-2018

Thank you, everyone. Your feedback not only resolved my issue but also taught me concepts I had not known before.

P.S. Regarding my post, I shall make the appropriate adjustments to my future posts. (It was my first time posting on this forum.)


RE: average word length - perfringo - Jul-18-2018

(Jul-17-2018, 04:32 PM)micseydel Wrote: I think perfringo has the right idea, but that wouldn't capture double-quotes ("). I would suggest a whitelist rather than a blacklist
from string import letters
# [...]
letter_count = len([char for char in chars if char in letters])

Whitelisting is definitely the way to go! It is much better to allow a specific set of letters than to try to guess what clever symbols users might enter.

It seems to me that there is no 'letters' in string.py. I get ImportError: cannot import 'letters' from 'string'. Shouldn't it be:
from string import ascii_letters?


RE: average word length - micseydel - Jul-18-2018

(Jul-18-2018, 06:14 AM)perfringo Wrote: Shouldn't it be
You're right
Output:
$ python
Python 2.7.15 (default, May 1 2018, 16:44:08)
[GCC 4.2.1 Compatible Apple LLVM 9.1.0 (clang-902.0.39.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from string import letters
>>> letters
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>>
$ python3
Python 3.6.5 (v3.6.5:f59c0932b4, Mar 28 2018, 03:03:55)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from string import letters
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name 'letters'
>>> from string import ascii_letters
>>>
Thanks for the catch :)
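
Putting the thread together, a whitelist version for Python 3 might look like the sketch below. It assumes only ASCII letters should count, so accented letters and digits are ignored; str.isalpha() would be one alternative if non-ASCII letters matter. The function name and the empty-input guard are illustrative.

```python
from string import ascii_letters


def average_word_length(sentence):
    """Average number of ASCII letters per whitespace-separated word."""
    words = sentence.split()
    if not words:
        return 0.0
    # Whitelist: count only characters that are ASCII letters.
    letter_count = sum(1 for char in sentence if char in ascii_letters)
    return letter_count / len(words)


print(average_word_length('Yes? No! "Yes?" No!'))  # 10 letters / 4 words = 2.5
```

A "word" made up entirely of punctuation (a lone dash, for example) still counts as a word here and pulls the average down, so filtering such tokens out may be worth considering.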