Python Forum

Sum similar items
Greetings!
I'm trying to count similar items in a file.
File example:
PRQ09_PCX0161Host,DV1
PRQ09_PCX0170Host,PHQ
PRQ09_PCX0171Host,ATC
PRQ09_PCX0173Host,ATC
PRQ09_PCX0175Host,ATC
PRQ09_PCX0176Host,PHQ
PRQ09_PCX0179Host,DG2
PRQ09_PCX0180Host,TGP_H
PRQ09_PCX0183Host,PHQ
PRQ09_PCX0184Host,PHQ
PRQ09_PCX0280Host,TCP_H
PRQ09_PCX0380Host,TCP_H
I thought I could use Counter. It counts nicely and produces the right numbers, but for some reason I cannot access the key-value pairs.

Code:
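from collections import Counter

prdcs = []

# prod_BY_host (input path) and mout (output file) are set up earlier, not shown here
with open(prod_BY_host, 'r') as pr_host:
    for lnf in pr_host:
        lnf = lnf.strip()
        host, prName = lnf.split(',')
        prdcs.append(prName)
    print(Counter(prdcs))
    mout.write(str(Counter(prdcs)))
mout.close()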
It prints:
Counter({'PHQ': 4, 'ATC': 3, 'TCP_H': 2, 'DV1': 1, 'DG2': 1, 'TGP_H': 1})
How can I get (print to an output file) the keys and values without the garbage?
Just this:
PHQ,4
ATC,3
TCP_H,2
DV1,1
DG2,1
TGP_H,1

Or maybe there is a better way to do this?
Thank you
You can treat it like any dict. You can read the elements with .keys(), the counts with .values(), or both together with .items().

from collections import Counter
with open ("prdcs_file.txt", 'r') as pr_host :
    c = Counter(x.split(',')[1] for x in pr_host.read().splitlines())

print('\n'.join(f"{k}, {v}" for k,v in c.items()))
Output:
DV1, 1
PHQ, 4
ATC, 3
DG2, 1
TGP_H, 1
TCP_H, 2
Or if you want them in order by greatest count, you can use most_common() in place of items().

from collections import Counter
with open ("prdcs_file.txt", 'r') as pr_host :
    c = Counter(x.split(',')[1] for x in pr_host.read().splitlines())

print('\n'.join(f"{k}, {v}" for k,v in c.most_common()))
Output:
PHQ, 4
ATC, 3
TCP_H, 2
DV1, 1
DG2, 1
TGP_H, 1
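The question asked for the result in an output file, with no space after the comma. Here is a minimal sketch of that step. The function names count_products and write_counts are made up for illustration, and skipping blank lines is an assumption about what the input might contain.

```python
from collections import Counter


def count_products(lines):
    """Count the second comma-separated field of each non-empty line."""
    return Counter(
        line.strip().split(',')[1]
        for line in lines
        if line.strip()  # assumption: blank lines should be ignored
    )


def write_counts(counts, out_path):
    """Write one "name,count" line per item, highest count first."""
    with open(out_path, 'w') as out:
        for name, n in counts.most_common():
            out.write(f"{name},{n}\n")
```

To use it, open the input file, pass the file object to count_products(), and hand the result to write_counts() together with the output path. Using the with block for the output file means there is no separate close() call to forget.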
Treat it just like a regular dictionary.
import collections

values = ['DV1', 'PHQ', 'ATC', 'ATC', 'ATC', 'PHQ', 'DG2', 'TGP_H', 'PHQ', 'PHQ', 'TCP_H', 'TCP_H']
count = collections.Counter(values)

print(count)
for key, value in count.items():
    print(key, ':', value)
Output:
Counter({'PHQ': 4, 'ATC': 3, 'TCP_H': 2, 'DV1': 1, 'DG2': 1, 'TGP_H': 1})
DV1 : 1
PHQ : 4
ATC : 3
DG2 : 1
TGP_H : 1
TCP_H : 2
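A few more dictionary-style lookups, as a small sketch on the same list. The key 'NOPE' is just a made-up name that is not in the data. One difference from a plain dict: a Counter gives 0 for a key it has never seen instead of raising KeyError.

```python
from collections import Counter

values = ['DV1', 'PHQ', 'ATC', 'ATC', 'ATC', 'PHQ', 'DG2', 'TGP_H', 'PHQ', 'PHQ', 'TCP_H', 'TCP_H']
count = Counter(values)

phq = count['PHQ']              # count for a single product
missing = count['NOPE']         # unseen key: returns 0 and does not add the key
top_two = count.most_common(2)  # the two most frequent items as (key, count) tuples
total = sum(count.values())     # total number of items counted

print(phq, missing, top_two, total)
```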
I'll try it tomorrow.
Thank you guys!
You rock!