Identifying items in a csv file that also appear in a Text extract
Python Forum (https://python-forum.io), Forum: Data Science (https://python-forum.io/forum-44.html), Thread: /thread-134.html
Identifying items in a csv file that also appear in a Text extract - Jaynorth - Sep-21-2016

I am new to Python. I have a text extract from a database and a csv list of all countries from Wikipedia, and I would like to check whether each country is mentioned in the text and how many times it is mentioned. This is what I have done so far:

<code>
text = pd.read_sql(select_string, con)

# clean up
text = text.replace({'\n': ' '}, regex=True)
text = text.replace({'-': ' '}, regex=True)
text = text['ProductText']
print(text)  # making sure it looks ok

country_codes = pd.read_csv('country-codes.csv')
codes = country_codes['English short name lower case']

count_occurrences = Counter(country for country in text if country in codes)
print(count_occurrences)
</code>

The problem is that the last piece of code is not picking up any countries at all, so the output is Counter(). I suspect that the problem is with the loop, but I am not sure how to fix it. Any help would really be appreciated :)

RE: Identifying items in a csv file that also appear in a Text extract - nilamo - Sep-21-2016

What does Counter() look like? What do country_codes and, more specifically, codes look like?

Also, you should probably rename some of your variables to... match what they actually are. Like this:

<code>
count_occurrences = Counter(word for word in text if word in codes)
</code>

Since we're assuming that not *every* word in the article is a country code.

Also, are you putting the whole text into lower/uppercase anywhere? If you're looking for "can", it wouldn't match "CAN" at all, for those poor Canadians :(

RE: Identifying items in a csv file that also appear in a Text extract - wavic - Sep-21-2016

<code>
from collections import Counter
counter = Counter(iterable)
print(counter['item'])
</code>

How does this syntax highlighting work? :huh:

RE: Identifying items in a csv file that also appear in a Text extract - Jaynorth - Sep-21-2016

Counter() just returns Counter() in the console when the script is run.
country_codes is the name I gave to the csv file that is read into the script, and codes is just a variable to which I assigned the relevant column of the csv file: country_codes['English short name lower case']. I am not matching on the Alpha-2 or Alpha-3 columns in the csv file, which use letter codes for the country, such as the 3 letter "CAN" XD

RE: Identifying items in a csv file that also appear in a Text extract - nilamo - Sep-21-2016

(Sep-21-2016, 09:05 PM) wavic Wrote:
    <code>
    from collections import Counter
    counter = Counter(iterable)
    print(counter['item'])
    </code>
    How does this syntax highlighting work? :huh:

Use the python syntax highlighter, not the generic code one. (They're still working out the plugins.)

Also, wouldn't your code just always give "0"?

<code>
>>> from collections import Counter
>>> cnt = Counter('Green eggs and spam')
>>> cnt['g']
2
>>> cnt['gg']
0
>>> cnt['eggs']
0
</code>

RE: Identifying items in a csv file that also appear in a Text extract - sparkz_alot - Sep-21-2016

You may have to supply a bit more code, such as where and how you have defined "Counter". If you are getting an error, please include the Traceback.

Never mind, still getting used to new forum :angel: :P

RE: Identifying items in a csv file that also appear in a Text extract - snippsat - Sep-21-2016

(Sep-21-2016, 09:05 PM) wavic Wrote:
    <code>
    from collections import Counter
    counter = Counter(iterable)
    print(counter['item'])
    </code>
    How does this syntax highlighting work? :huh:

Use the Python icon in sceditor; I have upgraded the name and color a little ;)

RE: Identifying items in a csv file that also appear in a Text extract - Jaynorth - Sep-21-2016

(Sep-21-2016, 09:12 PM) sparkz_alot Wrote:
    You may have to supply a bit more code, such as where and how you have defined "Counter". If you are getting an error, please include the Traceback.
    Never mind, still getting used to new forum :angel: :P

<code>
from collections import Counter
</code>

RE: Identifying items in a csv file that also appear in a Text extract - sparkz_alot - Sep-21-2016

(Sep-21-2016, 09:16 PM) Jaynorth Wrote:
    (Sep-21-2016, 09:12 PM) sparkz_alot Wrote:
        You may have to supply a bit more code, such as where and how you have defined "Counter". If you are getting an error, please include the Traceback.
        Never mind, still getting used to new forum :angel: :P

Thanks :)

RE: Identifying items in a csv file that also appear in a Text extract - Jaynorth - Sep-21-2016

(Sep-21-2016, 09:12 PM) nilamo Wrote:
    (Sep-21-2016, 09:05 PM) wavic Wrote:
        <code>
        from collections import Counter
        counter = Counter(iterable)
        print(counter['item'])
        </code>
        How does this syntax highlighting work? :huh:

<code>
Counter(country for country in text if country in codes)
</code>

My interpretation of this is: count every word in the text if the word is also in the country code csv file. Is this correct?
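
On that last question, a hedged note with a minimal sketch. Two pandas behaviours probably explain the empty Counter(): iterating over a Series such as text yields one whole row of text at a time rather than individual words, and `x in codes` on a Series tests the index labels, not the values. The sketch below sidesteps both by working on plain Python lists (which text.tolist() and codes.tolist() would give) and searching every row for every country name. The sample strings, the country list and the helper name count_country_mentions are made up for illustration; they are not the thread's actual data.

```python
import re
from collections import Counter

# Hypothetical stand-ins for the thread's data. In the original script these
# lists would come from text.tolist() and codes.tolist() on the pandas Series.
texts = [
    "Shipped from Canada to France, then back to canada.",
    "Made in Germany, sold in Nigeria.",
]
country_names = ["Canada", "France", "Germany", "Niger", "Nigeria"]


def count_country_mentions(texts, country_names):
    """Count how often each country name occurs across all text rows."""
    counts = Counter()
    for name in country_names:
        # \b keeps "Niger" from matching inside "Nigeria";
        # IGNORECASE lets "canada" count as a mention of "Canada".
        pattern = re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
        hits = sum(len(pattern.findall(row)) for row in texts)
        if hits:
            counts[name] = hits
    return counts


count_occurrences = count_country_mentions(texts, country_names)
print(count_occurrences)
```

Searching for each name as a phrase, instead of splitting the text into words, is a design choice so that multi-word names can still match. The whole-word check relies on the name starting and ending with a letter or digit, so names that end in punctuation would need a different boundary test.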