Hello. I have some data like this:
aaa
bbb
ccc
aaa
ccc
ddd
fff
aaa
ccc
aaa
I know using a set will give me the unique values, but what I need is the unique values and their counts. For example:
aaa 4
bbb 1
ccc 3
ddd 1
Batteries included ...
from collections import Counter
your_input = """aaa
bbb
ccc
aaa
ccc
ddd
fff
aaa
ccc
aaa"""
sequence = your_input.strip().split()
counter = Counter(sequence)
print(counter.most_common(5))
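If you want one value and its count per line, as in the question, here is a minimal sketch. The helper name format_counts is made up for illustration, and sorting by value is just one reasonable ordering:

```python
from collections import Counter

def format_counts(lines):
    """Return 'value count' strings, sorted by value (hypothetical helper)."""
    counter = Counter(lines)
    # counter.items() yields (value, count) pairs; sorted() orders them by value.
    return [f"{value} {count}" for value, count in sorted(counter.items())]

data = ["aaa", "bbb", "ccc", "aaa", "ccc", "ddd", "fff", "aaa", "ccc", "aaa"]
for line in format_counts(data):
    print(line)
```

Swap sorted(counter.items()) for counter.most_common() if you would rather have the most frequent values first.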
Or if you want to know how to do it:
your_input = """aaa
bbb
ccc
aaa
ccc
ddd
fff
aaa
ccc
aaa"""
counts = {}
for entry in your_input.strip().split():
    # dict.get returns 0 the first time an entry is seen, so the count starts at 1.
    counts[entry] = counts.get(entry, 0) + 1
print(counts)
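The same loop can also be written with collections.defaultdict, which supplies the default of 0 for you. This is a variant sketch, not what the answer above uses, and count_entries is a name invented here:

```python
from collections import defaultdict

def count_entries(entries):
    """Count occurrences by hand, letting defaultdict supply the initial 0."""
    counts = defaultdict(int)  # int() returns 0, so unseen keys start at 0
    for entry in entries:
        counts[entry] += 1
    return dict(counts)  # convert back to a plain dict for the caller

result = count_entries("aaa bbb ccc aaa".split())
print(result)
```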
from collections import Counter
# Named 'data' rather than 'input' to avoid shadowing the built-in input().
data = ['a', 'a', 'b', 'b', 'b']
c = Counter(data)
print(c.items())
If all you want to do is find the counts (and not do any further processing in a Python script), then if you're on UNIX, you don't need Python at all. Command line tools will do the job:
Command and output:
$ cat << EOF | sort | uniq -c
aaa
bbb
ccc
aaa
ccc
ddd
fff
aaa
ccc
aaa
EOF
4 aaa
1 bbb
3 ccc
1 ddd
1 fff
Obviously you can read from a file if you needed to, but I've used a here document to pass the input to cat. uniq's -c option provides the counts, but the program looks at adjacent lines only to filter out duplicates, so sort is necessary to put them next to each other.
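If you do end up wanting this inside Python, the adjacency point can be illustrated with itertools.groupby, which also groups only consecutive equal items. This is a rough sketch of the idea, not a reimplementation of uniq, and the function names are made up:

```python
from itertools import groupby

def uniq_c(lines):
    """Count runs of equal adjacent items, roughly like `uniq -c`."""
    return [(len(list(group)), value) for value, group in groupby(lines)]

def sort_uniq_c(lines):
    """Sort first so equal items become adjacent, roughly like `sort | uniq -c`."""
    return uniq_c(sorted(lines))

data = ["aaa", "bbb", "ccc", "aaa", "ccc", "ddd", "fff", "aaa", "ccc", "aaa"]

# Without sorting, no two equal items are adjacent here, so every run has length 1.
print(uniq_c(data))

# With sorting, each value is counted once.
for count, value in sort_uniq_c(data):
    print(count, value)
```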