get two characters, count and print from a .txt file
Python Forum (https://python-forum.io), Python Coding, General Coding Help, /thread-30068.html
get two characters, count and print from a .txt file - Pleiades - Oct-03-2020

Hi all, I'm back, anyway. The program I wrote is terrible, so if anyone can write a different .py and give me some pointers, thanks a bunch.

What I would like help with is printing only the first two characters of each word and, if those same characters repeat, keeping a count of them. It's like a Zipf distribution, but for the first two letters of every word. The example shows how I would like the output to look. Here is an example from this text and its for show!

text file = "Here is an example from this text and its for show!"

    an 2
    He 1
    is 1
    ex 1
    fr 1
    th 1
    te 1
    it 1
    fo 1
    sh 1
    total 11

    file = open(r"C:\python37\paradise.txt", 'r')
    while 1:
        # read two characters at a time
        char = file.read(2)
        if not char:
            break
        print(char)
    file.close()

RE: get two characters, count and print from a .txt file - buran - Oct-03-2020

Cross-posted at StackOverflow.

RE: get two characters, count and print from a .txt file - Pleiades - Oct-03-2020

@buran, I had no idea that would be illegal. I'm just asking for help from a nice person?

RE: get two characters, count and print from a .txt file - buran - Oct-03-2020

It's not illegal, but we expect the courtesy of letting us know about it. This applies not only here but on basically any specialized forum. If you don't understand why, read https://meta.stackexchange.com/q/141823

RE: get two characters, count and print from a .txt file - Pleiades - Oct-03-2020

@buran, solved at Stack Exchange:

    # dictionary to store the count of each word (2 characters), e.g. "an": 2
    wordDict = {}

    file = open("paradise.txt", 'r')
    # read each line in file
    for line in file:
        # read each word in line
        for word in line.split():
            # get only the first two letters of the word
            word = word[:2]
            # if word is not in the dictionary then add it
            if word not in wordDict:
                wordDict[word] = 1
            # else increment the count
            else:
                wordDict[word] += 1
    file.close()

    # print all values
    for key, val in wordDict.items():
        print(key, val)

    # print total
    print(f"Total: {sum(wordDict.values())}")

RE: get two characters, count and print from a .txt file - buran - Oct-03-2020

See, now you already have an accepted answer on SO. Any attempt by a member here to help would have been a waste of time if we didn't know about the cross-posting.

RE: get two characters, count and print from a .txt file - Pleiades - Oct-03-2020

@buran, there is a "set to solved" button and it is now checked, relax.

RE: get two characters, count and print from a .txt file - buran - Oct-03-2020

Don't worry, I am relaxed. And you know what to do next time you cross-post, right?

RE: get two characters, count and print from a .txt file - sandeep_ganga - Oct-03-2020

I tried something like the code below, see if it helps:

    file = open('h.txt', 'r')
    for each in file:
        print(each)
        k = each.split()
        ###print(k)
        j = []
        for i in k:
            ###print(i[:2])
            j.append(i[:2])
        ###print(j)
        result = dict((k, j.count(k)) for k in j)
        print(result)
        print("total--", sum(result.values()))
    file.close()

Best Regards,
Sandeep
GANGA SANDEEP KUMAR

RE: get two characters, count and print from a .txt file - perfringo - Oct-05-2020

There is ambiguity in this task. What should happen if 'a' is encountered (a totally realistic scenario in English)? Also, should 'he' and 'He' be considered different?

This solution is needlessly complicated. As mentioned in the SO post, there is Counter in the collections module of the standard library, which exists specifically for counting. So the code can be as simple as 'count the first two letters of every word on every line':

    from collections import Counter

    # Content of the file: Here is an example from this text and its for show!
    with open('two_chars_count.csv', 'r') as f:
        count = Counter(word[:2].lower() for line in f for word in line.split())

    print(*(f'{k}: {v}' for k, v in count.items()), sep='\n')

Output:

    he: 1
    is: 1
    an: 2
    ex: 1
    fr: 1
    th: 1
    te: 1
    it: 1
    fo: 1
    sh: 1
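The two open questions raised above (one-letter words such as 'a', and 'He' versus 'he') can be turned into explicit options. The following is only a sketch of one possible reading of the task, not anything posted in the thread: the function name `count_prefixes` and the parameters `fold_case` and `min_length` are made up for illustration, and stripping punctuation from words is an added assumption. It works on any iterable of lines, so an open file object can be passed in place of the sample list.

```python
from collections import Counter
import string


def count_prefixes(lines, fold_case=True, min_length=2):
    """Count the first two characters of each word in an iterable of lines.

    count_prefixes, fold_case and min_length are hypothetical names chosen
    for this sketch; they do not come from the thread.
    """
    counts = Counter()
    for line in lines:
        for word in line.split():
            # Assumption: punctuation around a word is not part of it.
            word = word.strip(string.punctuation)
            if fold_case:
                # Treat 'He' and 'he' as the same prefix.
                word = word.lower()
            # Assumption: words shorter than min_length (e.g. 'a') are skipped.
            if len(word) < min_length:
                continue
            counts[word[:2]] += 1
    return counts


sample = ["Here is an example from this text and its for show!"]
counts = count_prefixes(sample)

# most_common() lists the most frequent prefixes first.
for prefix, n in counts.most_common():
    print(prefix, n)
print("total", sum(counts.values()))
```

With the sample sentence this puts `an 2` at the top and ends with `total 11`, matching the layout asked for in the first post. Passing `fold_case=False, min_length=1` instead keeps 'He' and 'he' apart and counts one-letter words under their single letter.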