Python Forum
Count & Sort occurrences of text in a file - Printable Version



Count & Sort occurrences of text in a file - oradba4u - Sep-06-2020

ALL:
I have an ASCII text file that contains the first and last names from a phone list.
Each full name is on its own line.
I need a new list generated that is sorted by the number of times each name occurs in the text file.

for example:
the file names.txt contains:
bill smith
joe williams
bill smith
jane doe
joe williams
bill smith

the final list should look like this:
bill smith (3)
joe williams (2)
jane doe (1)

All the examples I see online break the text into a list of single letters, which I DON'T WANT!

Thanks in advance for your help.


RE: Count & Sort occurrences of text in a file - perfringo - Sep-06-2020

No list literal in Python can look like the desired “final list”.

What have you tried? Please post the code that produces the “single letter list” and maybe we can help.


RE: Count & Sort occurrences of text in a file - oradba4u - Sep-06-2020

from collections import Counter, defaultdict

f = open('phonebook.tmp', "r")
if f.mode == "r":
    text = f.read()


freqword = defaultdict(list)
for word, freq in Counter(text).items():
    freqword[freq].append(word)

# print in order of occurrence (with sorted list of words)
for freq in sorted(freqword):
    print('count {}: {}'.format(freq, sorted(freqword[freq])))
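
A note on why the attempt above produces single letters: f.read() returns the whole file as one string, and iterating over a string yields individual characters, so Counter(text) tallies characters rather than names. The sketch below is only an illustration: the `text` string is a stand-in for what f.read() would return, built from the sample data in the first post, and `char_counts` / `name_counts` are names made up for this example.

```python
from collections import Counter

# Stand-in for f.read(): the whole file as a single string
# (sample data from the first post, not the real phonebook.tmp).
text = "bill smith\njoe williams\nbill smith\njane doe\njoe williams\nbill smith\n"

# Iterating a str yields one character at a time, so this counts letters.
char_counts = Counter(text)

# Splitting the string into lines first makes each full name one item.
name_counts = Counter(text.splitlines())

print(sorted(char_counts)[:5])    # keys are single characters
print(name_counts["bill smith"])  # count for a full name
```

Splitting into lines (or iterating over the file object directly) is what turns each full name into one countable item.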



RE: Count & Sort occurrences of text in a file - buran - Sep-06-2020

You're overcomplicating things:
from collections import Counter
 
with open('dupes.csv', "r") as f:
    for name, name_count in Counter(f).items():
        print(f'{name.strip()}:{name_count}')
Output:
bill smith:3
joe williams:2
jane doe:1
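
One caveat with counting raw lines: Counter(f) counts each line including its trailing newline, so if the last line of the file has no newline, the same name can end up under two different keys. A minimal sketch of stripping before counting, assuming blank lines should be ignored; io.StringIO is used here only as a stand-in for the open file, and the sample content is made up for the illustration.

```python
from collections import Counter
import io

# Stand-in for the open file; note the last line has no trailing newline.
f = io.StringIO("bill smith\njoe williams\nbill smith")

# Strip each line before counting, and skip lines that are blank after stripping.
counts = Counter(line.strip() for line in f if line.strip())

for name, name_count in counts.items():
    print(f'{name}:{name_count}')
```

With the stripping done up front, the name.strip() call in the print is no longer needed.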



RE: Count & Sort occurrences of text in a file - oradba4u - Sep-06-2020

That's GREAT, but how can I sort the final list according to the number of occurrences?


RE: Count & Sort occurrences of text in a file - ndc85430 - Sep-06-2020

Both the list's sort method and the sorted function take an argument called key: a function that is called on each item to produce the value used for sorting.
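
As a rough sketch of that suggestion, the key function can pick the count out of each (name, count) pair. The `names` list below is the sample data from the first post, standing in for the lines of the file, and `by_count` is just a name chosen for this example.

```python
from collections import Counter

# Sample data from the first post, standing in for the stripped lines of the file.
names = ["bill smith", "joe williams", "bill smith",
         "jane doe", "joe williams", "bill smith"]
counts = Counter(names)

# key returns the count from each (name, count) pair;
# reverse=True puts the largest counts first.
by_count = sorted(counts.items(), key=lambda pair: pair[1], reverse=True)

for name, name_count in by_count:
    print(f'{name}:{name_count}')
```

Because sorted is stable, names with equal counts keep the order in which they were first seen.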


RE: Count & Sort occurrences of text in a file - buran - Sep-06-2020

(Sep-06-2020, 01:54 PM)oradba4u Wrote: That's GREAT, but how can I sort the final list according to the number of occurrences?
You can use the most_common() method of Counter:
Quote: Return a list of the n most common elements and their counts from the most common to the least. If n is omitted or None, most_common() returns all elements in the counter. Elements with equal counts are ordered in the order first encountered:

from collections import Counter
  
with open('dupes.csv', "r") as f:
    for name, name_count in Counter(f).most_common():
        print(f'{name.strip()}:{name_count}')
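
To get the exact "name (count)" layout asked for in the first post, the same approach can be wrapped in a small helper. This is only a sketch: `format_counts` is a hypothetical name, not something from the thread, and it takes any iterable of lines (an open file would work the same way as the sample list used here).

```python
from collections import Counter

def format_counts(lines):
    """Return 'name (count)' strings, most frequent name first."""
    # Strip whitespace/newlines and ignore blank lines before counting.
    counts = Counter(line.strip() for line in lines if line.strip())
    return [f"{name} ({count})" for name, count in counts.most_common()]

# Sample data from the first post; with a real file, pass the open file object instead.
sample = ["bill smith\n", "joe williams\n", "bill smith\n",
          "jane doe\n", "joe williams\n", "bill smith\n"]

for entry in format_counts(sample):
    print(entry)
```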



RE: Count & Sort occurrences of text in a file - oradba4u - Sep-06-2020

That did the trick. Thank you guys for all your help.
I'm a Python newbie, but I understand most of the syntax, especially once I see an example for the first time.