Jul-18-2018, 09:02 PM
(This post was last modified: Jul-18-2018, 09:03 PM by Totalnoobwithhelp.)
So I've cobbled together a Python script that, in essence, runs through a txt file and pulls out the words or phrases that match a set I've given it in a separate list. My question is whether there is a way to have Python run through all of the txt files (roughly 700) without my having to change the file name manually every time. In short, I'd like the script to process every txt file in a directory, hopefully in order, or at least keeping the results for each file separate so they don't all get added together at the end. I've had some serious help getting this far, but I've had no luck with this part, and now that the project has grown considerably (100 ---> 700 files), changing the file name manually is more and more of a pain.
File1 = open("A1.txt", "r")
File2 = open("applicable_words.txt", "r")
Filewords = {}
badSymbols = (".", ",", "!", "?", "’", '”', '“',)
totalCount = 0
applicableWords = []

for line in File2:
    applicableWords.append((line.lower()).strip())

for line in File1:
    if not line.startswith("Q:"):
        for word in line.split():
            s = (word.lower())
            for symbol in badSymbols:
                s = s.replace(symbol, "")
            if len(s) > 1:
                if s in Filewords:
                    Filewords[s] += 1
                    totalCount += 1
                else:
                    Filewords[s] = 1
                    totalCount += 1
            else:
                totalCount += 1

frequencyList = []
for word in Filewords:
    if word in applicableWords:
        if len(frequencyList) == 0 or Filewords[word] == 1:
            frequencyList.append(word)
        else:
            for item in frequencyList:
                if not word in frequencyList and Filewords[word] >= Filewords[item]:
                    frequencyList.insert(frequencyList.index(item), word)

for value in frequencyList:
    print(value + ": " + str(Filewords[value]))
print()
print("Total number of words said: " + str(totalCount))

File1.close()
File2.close()
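
For reference, here is a rough sketch of how the "every txt file in a directory" part might look. It is not a drop-in replacement for the script above, just one possible shape under a few assumptions: the txt files and applicable_words.txt sit in the same directory, the files are UTF-8, and the names `load_applicable_words`, `count_file`, `process_directory` and `print_report` are made up for illustration. It also swaps the hand-built frequency list for `collections.Counter`, which is a different technique from the original sorting loop. Each file gets its own counts, so nothing is added together across files.

```python
from collections import Counter
from pathlib import Path

# Same punctuation the original script strips from each word.
BAD_SYMBOLS = (".", ",", "!", "?", "’", "”", "“")


def load_applicable_words(path):
    """Read the word list once; one lowercase entry per non-empty line."""
    with open(path, encoding="utf-8") as f:
        return {line.strip().lower() for line in f if line.strip()}


def count_file(path, applicable_words):
    """Return (Counter of applicable words, total word count) for one file."""
    counts = Counter()
    total = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.startswith("Q:"):
                continue  # skip question lines, as in the original
            for word in line.split():
                s = word.lower()
                for symbol in BAD_SYMBOLS:
                    s = s.replace(symbol, "")
                total += 1  # the original counts every word toward the total
                if len(s) > 1 and s in applicable_words:
                    counts[s] += 1
    return counts, total


def process_directory(directory, words_file="applicable_words.txt"):
    """Process every .txt file in `directory`, keeping results per file."""
    directory = Path(directory)
    applicable = load_applicable_words(directory / words_file)
    results = {}
    # sorted() orders names as plain strings, so A10.txt comes before A2.txt.
    for path in sorted(directory.glob("*.txt")):
        if path.name == words_file:
            continue  # assumption: the word list lives next to the data files
        results[path.name] = count_file(path, applicable)
    return results


def print_report(results):
    """Print one block per file, most frequent words first."""
    for name, (counts, total) in results.items():
        print(name)
        for word, n in counts.most_common():
            print(word + ": " + str(n))
        print("Total number of words said: " + str(total))
        print()
```

Calling `print_report(process_directory("."))` from the folder that holds the files would print one block per file. Note that, like the original, this only matches single words, since each line is split on whitespace; multi-word phrases in the list would never match.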