Posts: 7
Threads: 1
Joined: Sep 2016
I am new to Python. I have a text extract from a database and a CSV list of all countries from Wikipedia, and I would like to check whether each country is mentioned in the text and how many times it is mentioned. This is what I have done so far:
<code>
text = pd.read_sql(select_string, con)
#clean up
text = text.replace({'\n': ' '}, regex=True)
text = text.replace({'-': ' '}, regex=True)
text = text['ProductText']
print(text) #making sure it looks ok
country_codes = pd.read_csv('country-codes.csv')
codes = country_codes['English short name lower case']
count_occurrences=Counter(country for country in text if country in codes)
print(count_occurrences)
</code>
The problem is that the last piece of code is not picking up any countries at all, so the output is Counter()
I suspect that the problem is with the loop but I am not sure how to fix it - any help would really be appreciated :)
Posts: 3,458
Threads: 101
Joined: Sep 2016
What does Counter() look like? What do country_codes, and more specifically codes, look like?
Also, you should probably rename some of your variables to... match what they actually are. Like this:
count_occurrences = Counter(word for word in text if word in codes)
Since we're assuming that not *every* word in the article is a country code.
Also, are you putting the whole text into lower/uppercase anywhere? If you're looking for "can", it wouldn't match "CAN" at all, for those poor Canadians :(
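A minimal sketch of the case point above, using only the standard library. The sample `text` string and the `codes` set are made-up stand-ins for the ProductText column and the CSV column, not the poster's real data, and lower-casing both sides is an assumption about how the match should behave.

```python
from collections import Counter

# Hypothetical sample data (assumption): stand-ins for the real text and CSV column.
text = "Shipped from CANADA to France; canada again."
codes = {"canada", "france", "germany"}  # names stored in lower case

# Lower-case the text, split it into words, and strip trailing punctuation.
words = (word.strip(".,;:!?") for word in text.lower().split())

# Keep only the words that are known country names, then count them.
count_occurrences = Counter(word for word in words if word in codes)
print(count_occurrences)
```

Splitting on whitespace only finds single-word names, so a name such as "United States" would need a different approach.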
Posts: 2,953
Threads: 48
Joined: Sep 2016
Sep-21-2016, 09:05 PM
(This post was last modified: Sep-21-2016, 09:18 PM by snippsat.)
from collections import Counter
counter = Counter(iterable)
print(counter['item'])
How does this syntax highlighting work? :huh:
Posts: 7
Threads: 1
Joined: Sep 2016
The counter just shows up as Counter() in the console when the script is run. country_codes is the DataFrame read from the csv file, and codes is just the variable I assigned the relevant column to: country_codes['English short name lower case']
I am not matching on the Alpha-2 or Alpha-3 columns in the csv file, which use the 2- and 3-letter codes for each country, like "CAN" XD
Posts: 3,458
Threads: 101
Joined: Sep 2016
(Sep-21-2016, 09:05 PM)wavic Wrote: from collections import Counter
counter = Counter(iterable)
print(counter['item'])
How does this syntax highlighting work? :huh:
Use the python syntax highlighter, not the generic code one. (they're still working out the plugins)
Also, wouldn't your code just always give "0"?
>>> from collections import Counter
>>> cnt = Counter('Green eggs and spam')
>>> cnt['g']
2
>>> cnt['gg']
0
>>> cnt['eggs']
0
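A short sketch of why the session above returns 0 for 'eggs': Counter iterates over whatever it is given, and iterating a string yields single characters. Splitting the string first gives it words to count. The sample sentence is made up for illustration.

```python
from collections import Counter

sentence = "Green eggs and spam and eggs"  # hypothetical sample text

# A string is iterated character by character, so the keys are single characters.
char_counts = Counter(sentence)

# Splitting on whitespace first makes each whole word a key.
word_counts = Counter(sentence.split())

print(char_counts["eggs"])  # no such key among the characters
print(word_counts["eggs"])  # counted as a whole word
```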
Posts: 1,298
Threads: 38
Joined: Sep 2016
Sep-21-2016, 09:12 PM
(This post was last modified: Sep-21-2016, 09:13 PM by sparkz_alot.)
You may have to supply a bit more code, such as where and how you have defined "Counter". If you are getting an error, please include the Traceback. Never mind, still getting used to the new forum :angel: :P
If it ain't broke, I just haven't gotten to it yet.
OS: Windows 10, openSuse 42.3, freeBSD 11, Raspian "Stretch"
Python 3.6.5, IDE: PyCharm 2018 Community Edition
Posts: 7,312
Threads: 123
Joined: Sep 2016
Sep-21-2016, 09:15 PM
(This post was last modified: Sep-21-2016, 09:15 PM by snippsat.)
(Sep-21-2016, 09:05 PM)wavic Wrote: from collections import Counter
counter = Counter(iterable)
print(counter['item'])
How does this syntax highlighting work? :huh:
Use the Python icon in sceditor; I have upgraded the name and color a little ;)
Posts: 7
Threads: 1
Joined: Sep 2016
(Sep-21-2016, 09:12 PM)sparkz_alot Wrote: You may have to supply a bit more code, such as where and what have you defined "Counter". If you are getting an error, please include the Traceback. Never mind, still getting used to new forum :angel: :P
from collections import Counter
Posts: 1,298
Threads: 38
Joined: Sep 2016
(Sep-21-2016, 09:16 PM)Jaynorth Wrote: (Sep-21-2016, 09:12 PM)sparkz_alot Wrote: You may have to supply a bit more code, such as where and what have you defined "Counter". If you are getting an error, please include the Traceback. Never mind, still getting used to new forum :angel: :P
from collections import Counter
Thanks :)
Posts: 7
Threads: 1
Joined: Sep 2016
Counter(country for country in text if country in codes)
(Sep-21-2016, 09:12 PM)nilamo Wrote: (Sep-21-2016, 09:05 PM)wavic Wrote: from collections import Counter
counter = Counter(iterable)
print(counter['item'])
How does this syntax highlighting work? :huh:
Use the python syntax highlighter, not the generic code one. (they're still working out the plugins)
Also, wouldn't your code just always give "0"?
>>> from collections import Counter
>>> cnt = Counter('Green eggs and spam')
>>> cnt['g']
2
>>> cnt['gg']
0
>>> cnt['eggs']
0
My interpretation of this is: count every word in the text if that word is also in the country code csv file. Is this correct?
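One way to test that reading, sketched with plain Python lists standing in for the two pandas columns (an assumption, as are the sample values and the helper name `count_countries`). Iterating over a column of text yields each whole entry rather than each word, so the membership test compares entire entries against the names. In pandas there is a further catch: `in` on a Series checks its index labels, not its values. The sketch swaps in a regular-expression search per name, which also copes with multi-word names.

```python
import re
from collections import Counter

# Hypothetical stand-ins (assumption) for text['ProductText'] and the name column.
product_texts = ["Made in France, sold in the United States.", "Imported from france."]
country_names = ["France", "United States", "Canada"]

# The original pattern: each item is a whole entry, so nothing matches a name.
naive = Counter(entry for entry in product_texts if entry in country_names)

def count_countries(texts, names):
    """Count case-insensitive whole-phrase mentions of each name across all texts."""
    counts = Counter()
    for name in names:
        # \b keeps "France" from matching inside a longer word.
        pattern = re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
        total = sum(len(pattern.findall(entry)) for entry in texts)
        if total:
            counts[name] = total
    return counts

count_occurrences = count_countries(product_texts, country_names)
print(naive)
print(count_occurrences)
```

With real data, the two lists could be built from the columns with `.tolist()`.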