Python Forum
Location Named Entity Recognition Problem
I'm working with Twitter data. I have user records in JSON format, and I'm trying to extract the location from each user's location field.

Sample data:
"location": "Georgia, USA",
"location": "El Centro, CA",
"location": "Barnaul",
"location": "heaven on earth",
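
To make the sample concrete, here is a minimal sketch of pulling the location strings out of the user records. It assumes each user is a JSON object with a "location" key, as in the sample above. The "screen_name" values, the null entry, and the helper name extract_locations are made up for illustration.

```python
import json

# Hypothetical user records; only the "location" strings come from the sample above.
raw = """
[
  {"screen_name": "user_a", "location": "Georgia, USA"},
  {"screen_name": "user_b", "location": "El Centro, CA"},
  {"screen_name": "user_c", "location": "Barnaul"},
  {"screen_name": "user_d", "location": "heaven on earth"},
  {"screen_name": "user_e", "location": null}
]
"""


def extract_locations(users):
    """Return the stripped, non-empty location strings from a list of user dicts."""
    locations = []
    for user in users:
        value = user.get("location")
        # Skip missing, null, or whitespace-only values.
        if isinstance(value, str) and value.strip():
            locations.append(value.strip())
    return locations


users = json.loads(raw)
locations = extract_locations(users)
print(locations)
```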

The Problem:
The text in the location field has no consistent format and follows no standard. If users stuck to something like ISO country codes, it would be easy to separate city, state, and country, but as it stands nothing indicates what kind of location a given string refers to.
 
For example, the text in the location field follows patterns such as these:
 
1) Country (ex. Canada)
This is a country, but the field is free text and could be anything. I can match the text against a list of countries, but what if it is actually a city?
 
2) City (ex. Toronto)
Or the text can be a city name.
 
3) City, Country (ex. Toronto, Canada)
City and country separated by a comma or a space.
 
4) City, State (ex. Toronto, Ontario)
City and state separated by a comma or a space.
 
5) Meaningless text (ex. Worldwide)
Text that is not a city, country, or state.
 
6) Different Language (ex. 广州)
The same patterns as listed above, but in a language other than English, for example Chinese.
 
7) Abbreviations and ISO codes
  • Countries are sometimes written as ISO codes, such as CA or CAN for Canada.
  • States are sometimes abbreviated, such as FL for Florida (a U.S. state).
  • Subdivisions sometimes appear as ISO 3166-2 codes, such as US-MN for Minnesota (the state that contains Minneapolis).
Kindly guide me on how to solve this problem; there are many libraries to choose from.
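
One way to frame the patterns above in code is a lookup-based first pass. This is only a sketch under stated assumptions: the small name and code sets below are hand-made stand-ins for a real gazetteer, the function names are hypothetical, and only comma-separated input is split (space-separated "City Country" strings are not handled). It returns every label a token could carry, so ambiguous tokens such as CA (Canada or California) stay visibly ambiguous instead of being guessed.

```python
import unicodedata

# Tiny illustrative lookup sets (assumptions); real data would come from a gazetteer.
COUNTRY_NAMES = {"canada", "usa", "georgia"}
COUNTRY_CODES = {"CA", "CAN", "US", "USA"}
STATE_NAMES = {"ontario", "florida", "georgia", "minnesota"}
STATE_CODES = {"CA", "FL", "GA", "MN"}
CITY_NAMES = {"toronto", "el centro", "barnaul", "minneapolis"}


def is_non_latin(text):
    """True if any letter in text is outside the Latin script."""
    return any(
        ch.isalpha() and not unicodedata.name(ch, "").startswith("LATIN")
        for ch in text
    )


def classify_part(part):
    """Return the sorted list of labels a single token could carry."""
    labels = set()
    key = part.lower()
    if key in COUNTRY_NAMES:
        labels.add("country")
    if key in STATE_NAMES:
        labels.add("state")
    if key in CITY_NAMES:
        labels.add("city")
    # Short all-caps tokens are treated as codes, which may be ambiguous.
    if part.isupper() and len(part) <= 3:
        if part in COUNTRY_CODES:
            labels.add("country")
        if part in STATE_CODES:
            labels.add("state")
    return sorted(labels)


def classify_location(text):
    """Split a location string on commas and label each part."""
    text = text.strip()
    if is_non_latin(text):
        # Needs translation or a multilingual gazetteer; flagged, not resolved.
        return [(text, ["non_latin"])]
    parts = [p.strip() for p in text.split(",") if p.strip()]
    return [(p, classify_part(p) or ["unknown"]) for p in parts]


for sample in ["Toronto, Canada", "El Centro, CA", "Georgia, USA", "Worldwide", "广州"]:
    print(sample, "->", classify_location(sample))
```

Strings that come back as "unknown" or with more than one label are the cases where a lookup alone is not enough and a geocoding service or a trained NER model would have to take over.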
Messages In This Thread
Location Named Entity Recognition Problem - by owais - Mar-22-2017, 04:56 AM
