Python Forum
Python word counter and ranker
I’ve got an older Python 2 script from an outdated Udemy course. The script opens any basic raw text file (such as a large public domain novel like Alice in Wonderland), counts all the words, and ranks the 10 most common. Naturally, you can expect many occurrences of ‘the’, ‘is’, ‘a’.

It runs as expected using the Python 2 interpreter. Attached is Alice in Wonderland in .txt format. Here is the Python 2 script:

#!/usr/bin/env python
# encoding: utf-8
"""
alice_file.py

Created by Jason Elbourne on 2011-12-29.
Copyright (c) 2011 Jason Elbourne. All rights reserved.
"""
import operator

## Get each word - Turn to Lower case (.lower())
## Count Duplicates of words
## Dictionary {word:count,word2:count2}
## Sort this based on most used word
## Print the Top 10 Words

def rank_words(f):
	"""
		Takes in a file, then ranks all the words within the file
		
		Args: a file
		
		Return: A sorted list of tuples
	"""
	word_dict = {} # Start with empty python Dictionary
	words = [] # Start with empty python List
	for line in f:
		list_of_words = line.split()
		for w in list_of_words:
			words.append(w.lower()) # Add Word to List

	for word in words:
		if word_dict.has_key(word):
			word_dict[word] += 1 # Incr. value in Dict.
		else:
			word_dict[word] = 1 # Add word and value to Dict.
        # This will sort the dictionary and return a list of Tuples
	return sorted(word_dict.iteritems(), reverse=True, \
					key=operator.itemgetter(1))


def main():
	# Files
	f = open('Alice.txt', 'rU')

	ranked_words_list = rank_words(f)

	f.close()

        # Print the results
	for w in list(ranked_words_list[:10]):
		print w[0],"---", w[1]


if __name__ == '__main__':
	main()
Here is the expected output:

Quote:$ python2 pycounter.py
the --- 1605
and --- 766
to --- 706
a --- 614
she --- 518
of --- 493
said --- 421
it --- 362
in --- 351
was --- 333

It runs. Pretty neat, eh?

But on its first run using the Python 3 interpreter, the traceback points to line 52:

Quote:$ python pycounter.py
File "pycounter.py", line 52
print w[0],"---", w[1]
^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print(w[0],"---", w[1])?

So I add parentheses before the w and after the second index on that line.
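
For reference, a minimal sketch of what that corrected line looks like in Python 3, where print is a function rather than a statement. The `print_top` helper and the sample list are hypothetical, added only so the snippet runs on its own; the counts are copied from the expected output above.

```python
def print_top(ranked_words_list, limit=10):
    """Print the first `limit` (word, count) tuples, one per line."""
    for w in ranked_words_list[:limit]:
        # Python 3: print is a function, so its arguments go in parentheses.
        print(w[0], "---", w[1])


# Hypothetical sample of the (word, count) tuples the script builds.
print_top([("the", 1605), ("and", 766), ("to", 706)])
```

The default separator of print() is a single space, so the output keeps the same "word --- count" layout as the Python 2 version.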

When I run the script again, I get this traceback:

Quote:$ python pycounter.py
pycounter.py:44: DeprecationWarning: 'U' mode is deprecated
  f = open('Alice.txt', 'rU')
Traceback (most recent call last):
  File "pycounter.py", line 56, in <module>
    main()
  File "pycounter.py", line 46, in main
    ranked_words_list = rank_words(f)
  File "pycounter.py", line 38, in rank_words
    return sorted(word_dict.iteritems(), reverse=True, \
AttributeError: 'dict' object has no attribute 'iteritems'
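
The AttributeError on the last line of that traceback comes from `dict.iteritems()`, which does not exist in Python 3; `dict.items()` takes its place. The `has_key()` call earlier in the same function was also removed and would likely fail next. Below is a minimal sketch of `rank_words` with only those two calls changed, assuming the rest of the logic stays as in the original script. The `sample` list is a hypothetical stand-in for the open file.

```python
import operator


def rank_words(lines):
    """Count lower-cased words and return (word, count) tuples, most common first."""
    word_dict = {}
    for line in lines:
        for w in line.split():
            word = w.lower()
            # Python 3 removed dict.has_key(); the `in` operator replaces it.
            if word in word_dict:
                word_dict[word] += 1
            else:
                word_dict[word] = 1
    # Python 3 removed dict.iteritems(); dict.items() returns a view
    # that sorted() accepts directly.
    return sorted(word_dict.items(), reverse=True, key=operator.itemgetter(1))


# Hypothetical sample input; any iterable of text lines works, including a file object.
sample = ["The cat and the hat", "the end"]
print(rank_words(sample)[0])
```

Because the function only iterates over its argument, it can be tried on a short list of strings before pointing it at the full text file.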

The first issue is the U mode flag for the open function, which is deprecated in Python 3. The official docs say so here. So I remove the U. Problem solved. But I can’t make sense of the other lines indicated in the traceback. Line 56 is the main() call under the if __name__ == '__main__' guard. I’m not sure what the problem is there. It looks normal and correct to me.
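
A short sketch of why dropping the U should be safe: in Python 2 the flag requested universal newlines, and Python 3 text mode already does that by default (the flag was later removed outright in Python 3.11). The temporary file and its contents here are hypothetical stand-ins for Alice.txt.

```python
import os
import tempfile

# Hypothetical stand-in for Alice.txt, written with Windows-style line endings.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w", newline="") as out:  # newline="" writes "\r\n" untouched
    out.write("first line\r\nsecond line\r\n")

# Plain 'r' in Python 3: the default newline=None translates "\r\n" and "\r"
# to "\n" on read, which is what 'rU' asked for in Python 2.
with open(path, "r") as f:
    lines = f.readlines()

print(lines)
```

Using a with block also closes the file automatically, so the separate f.close() call is not needed in this form.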

Could someone here lend a helping hand to get this script to run in Python 3?


Messages In This Thread
Python word counter and ranker - by Drone4four - Jan-14-2019, 02:30 AM
RE: Python word counter and ranker - by snippsat - Jan-14-2019, 06:39 AM
RE: Python word counter and ranker - by Drone4four - Jan-15-2019, 02:00 AM
RE: Python word counter and ranker - by snippsat - Jan-15-2019, 02:40 AM
RE: Python word counter and ranker - by Drone4four - Jan-16-2019, 02:25 AM
RE: Python word counter and ranker - by snippsat - Jan-16-2019, 07:04 AM
RE: Python word counter and ranker - by Drone4four - Jan-18-2019, 11:58 PM
