Python Forum
get two characters, count and print from a .txt file




get two characters, count and print from a .txt file - Pleiades - Oct-03-2020

Hi all, I'm back.

The program I wrote is terrible, so if anyone can write a different .py and give me some pointers, thanks a bunch.

What I would like help with is printing only the first two characters of each word and, when the same two characters occur again, keeping a count of them. It's like a Zipf distribution, but for the first two letters of every word. The example below shows how I would like the output to look; the sentence in it is only for show.

text file = "Here is an example from this text and its for show!"

an 2
He 1
is 1
ex 1
fr 1
th 1
te 1
it 1
fo 1
sh 1
total 11

file = open(r"C:\python37\paradise.txt", 'r')

while True:
    # read two characters at a time
    char = file.read(2)
    if not char:
        break

    print(char)

file.close()
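
For context, here is a minimal sketch of why the loop above does not print one prefix per word: `read(2)` returns consecutive two-character chunks of the whole stream and ignores word boundaries. `io.StringIO` stands in for the real file, and the sample sentence is the one from the post; both are assumptions made only for illustration.

```python
import io

text = "Here is an example from this text and its for show!"

# read(2) walks the stream two characters at a time, spaces included.
stream = io.StringIO(text)
chunks = []
while True:
    chunk = stream.read(2)
    if not chunk:
        break
    chunks.append(chunk)

# Splitting on whitespace first, then slicing, gives one prefix per word.
prefixes = [word[:2] for word in text.split()]

print(chunks[:5])    # chunks straddle spaces and word boundaries
print(prefixes[:5])  # one two-character prefix per word
```

Splitting each line into words before slicing is what the replies further down the thread do.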



RE: get two characters, count and print from a .txt file - buran - Oct-03-2020

cross-posted at StackOverflow


RE: get two characters, count and print from a .txt file - Pleiades - Oct-03-2020

@buran,

I had no idea that would be illegal. I'm just asking for help from a nice person.


RE: get two characters, count and print from a .txt file - buran - Oct-03-2020

It's not illegal, but we expect the courtesy of letting us know about it. This applies not only here but on basically any specialized forum. If you don't understand why, read https://meta.stackexchange.com/q/141823


RE: get two characters, count and print from a .txt file - Pleiades - Oct-03-2020

@buran
solved at stack-exchange

# dictionary to store the count of each two-character prefix, e.g. "an": 2
wordDict = {}

# read each line in the file
with open("paradise.txt", 'r') as file:
    for line in file:
        # read each word in the line
        for word in line.split():
            # get only the first two letters of the word
            word = word[:2]
            # if the prefix is not in the dictionary, add it
            if word not in wordDict:
                wordDict[word] = 1
            # else increment the count
            else:
                wordDict[word] += 1

# print all values
for key, val in wordDict.items():
    print(key, val)

# print total
print(f"Total: {sum(wordDict.values())}")



RE: get two characters, count and print from a .txt file - buran - Oct-03-2020

See, now you already have an accepted answer on SO. Any attempt by a member here to help would be a waste of time if we didn't know about the cross-posting.


RE: get two characters, count and print from a .txt file - Pleiades - Oct-03-2020

@buran,

There is a "set to solved" button and it is now checked, relax.


RE: get two characters, count and print from a .txt file - buran - Oct-03-2020

Don't worry, I am relaxed. And you know what to do next time you cross-post, right?


RE: get two characters, count and print from a .txt file - sandeep_ganga - Oct-03-2020

I tried something like the below; see if it helps.

file = open('h.txt', 'r')

j = []
for each in file:
    print(each)
    # collect the first two characters of every word on this line
    for i in each.split():
        j.append(i[:2])

file.close()

# map each prefix to the number of times it occurs
result = dict((k, j.count(k)) for k in j)
print(result)
print("total--", sum(result.values()))
Output:
h.txt input file:
Here is an example from this text and its for show!

py g.py
Here is an example from this text and its for show!
{'He': 1, 'is': 1, 'an': 2, 'ex': 1, 'fr': 1, 'th': 1, 'te': 1, 'it': 1, 'fo': 1, 'sh': 1}
total-- 11
Best Regards,
Sandeep

GANGA SANDEEP KUMAR


RE: get two characters, count and print from a .txt file - perfringo - Oct-05-2020

There is ambiguity in this task. What should happen if 'a' is encountered (a totally realistic scenario in English)? Also, should 'he' and 'He' be considered different?

This solution is needlessly complicated. As mentioned in the SO post, there is Counter in the built-in collections module, which is specifically for counting. So the code can be as simple as 'count the first two letters of every word on every line':

from collections import Counter

# Content of the file: Here is an example from this text and its for show!

with open('two_chars_count.csv', 'r') as f:
    count = Counter(word[:2].lower() for line in f for word in line.split())

print(*(f'{k}: {v}' for k, v in count.items()), sep='\n')

he: 1
is: 1
an: 2
ex: 1
fr: 1
th: 1
te: 1
it: 1
fo: 1
sh: 1
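
The output requested at the top of the thread lists the most frequent prefix first and ends with a total. A minimal sketch of one way to get that ordering is `Counter.most_common()`. The function name `count_prefixes` and its `fold_case` and `min_len` parameters are hypothetical, added only to make the open questions about case and one-letter words explicit; `io.StringIO` stands in for the real file.

```python
from collections import Counter
import io


def count_prefixes(lines, fold_case=False, min_len=1):
    """Count the first two characters of every whitespace-separated word.

    fold_case and min_len are illustrative knobs, not part of the thread's code.
    """
    counts = Counter()
    for line in lines:
        for word in line.split():
            if len(word) < min_len:
                continue  # e.g. min_len=2 skips one-letter words such as 'a'
            prefix = word[:2]
            counts[prefix.lower() if fold_case else prefix] += 1
    return counts


sample = io.StringIO("Here is an example from this text and its for show!\n")
counts = count_prefixes(sample)

# most_common() orders by descending count; ties keep first-seen order.
for prefix, n in counts.most_common():
    print(prefix, n)
print("total", sum(counts.values()))
```

Whether to fold case or skip short words depends on what the counts are meant to show, so both are left as options rather than defaults.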