Python Forum
Location Named Entity Recognition Problem
#1
I'm dealing with Twitter data. I have users in JSON format, and I'm trying to extract the location from the "location" field of each user. Here is some sample data.

Sample data:
"location": "Georgia, USA",

"location": "El Centro, CA",

"location": "Barnaul",
"location": "heaven on earth",

The Problem:
The text in the location field does not follow any consistent format or standard. If users entered ISO country codes, for example, one could easily separate city, state, and country, but as it stands there is no clear indication of what kind of location a given text represents.
 
For example, the text in the location field follows patterns such as these:
 
1) Country (ex. Canada)
This is a country, but it is just free text and could be anything else. One can match the text against a list of countries, but what if it is actually a city?
 
2) City (ex. Toronto)
Or it can be a city
 
3) City, Country (ex. Toronto, Canada)
City and country separated by a comma or a space
 
4) City, State (ex. Toronto, Ontario)
City and state separated by a comma or a space
 
5) Meaningless text (ex. Worldwide)
Text which is not a city, country or state
 
6) Different Language (ex. 广州)
Same patterns as listed above but in a language other than English, for example, Chinese.
 
7) Abbreviations and ISO codes
  • Sometimes countries are represented by ISO codes such as CA or CAN for Canada,
  • states by abbreviations such as FL for Florida (a U.S. state),
  • and subdivisions by codes such as US-MN, which is the ISO 3166-2 code for Minnesota (the state that contains Minneapolis).
Kindly guide me on how to solve this problem; there are many libraries to choose from.
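
As a rough illustration of the patterns listed above, here is a minimal sketch that splits a location string on commas and looks each part up in small lookup tables. The tables and the names `candidate_types` and `parse_location` are made up for this example; a real solution would need a full gazetteer or a geocoding library. The sketch returns every type a part could be, because the ambiguity described above (CA as Canada or California, Georgia as a country or a U.S. state) cannot be resolved by lookup alone.

```python
# Tiny illustrative lookup tables (assumptions, not real reference data).
COUNTRY_NAMES = {"canada", "usa", "united states", "georgia"}
COUNTRY_CODES = {"ca", "can", "us"}
STATE_NAMES = {"georgia", "florida", "california", "minnesota", "ontario"}
STATE_CODES = {"ga", "fl", "ca", "mn"}
CITY_NAMES = {"toronto", "el centro", "barnaul", "minneapolis"}


def candidate_types(part):
    """Return the sorted list of location types this text could be."""
    key = part.strip().lower()
    types = set()
    if key in COUNTRY_NAMES or key in COUNTRY_CODES:
        types.add("country")
    if key in STATE_NAMES or key in STATE_CODES:
        types.add("state")
    if key in CITY_NAMES:
        types.add("city")
    return sorted(types)


def parse_location(text):
    """Split on commas and pair each non-empty part with its candidate types."""
    parts = [p.strip() for p in text.split(",") if p.strip()]
    return [(p, candidate_types(p)) for p in parts]


if __name__ == "__main__":
    for raw in ["Georgia, USA", "El Centro, CA", "Barnaul", "heaven on earth"]:
        print(raw, "->", parse_location(raw))
```

A part with an empty candidate list is either meaningless text, such as "Worldwide", or something the tables do not cover, such as a name written in another language. Parts with more than one candidate need context from the neighbouring parts to be resolved.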
Reply
#2
Where is this file located?

What you've presented is not of any value.

If there is no structure to the data, how do you intend to make order out of chaos?

I believe that if someone went to the effort of creating a json file, that there must be structure to the data.
Reply
#3
(Mar-22-2017, 05:44 AM)Larz60+ Wrote: Where is this file located? What you've presented is not of any value. If there is no structure to the data, how do you intend to make order out of chaos? I believe that if someone went to the effort of creating a json file, that there must be structure to the data.

Here is the link to the Twitter followers dump I have collected (it has 3000 followers).
You can use this excellent tool to view the file in a tree view:
http://jsonviewer.stack.hu/
There is a "Text" -> "load json data" option which can load json from url.

In that JSON file I'm interested in the location property (field/variable) for now; that was the data I was referring to in the question.
Reply
#4
That's just a json viewer. Where is the data file?
Quote:In that json file I'm interested in location property(field/variable) for now
Where is that json file??
Reply
#5
(Mar-22-2017, 01:05 PM)Larz60+ Wrote: That's just a json viewer. Where is the data file?
Quote: In that json file I'm interested in location property(field/variable) for now
Where is that json file??

Sorry, my bad.
https://gist.githubusercontent.com/Owais...owers.json
Reply
#6
Let me examine the file and see what I can figure out.
Reply
#7
This looks like a standard JSON file,
so you should be able to read it in with json.load.
You will then have a list of dictionary entries.
To get the individual elements of each entry, loop over the list and use for key, value in entry.items():
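
A minimal sketch of that approach follows, assuming the file is a list of user dictionaries that each may carry a "location" key. The inline sample stands in for the real file, and the "screen_name" values are invented for illustration. With a file on disk you would call json.load on the open file object instead of json.loads on a string.

```python
import json

# Inline stand-in for the followers file (assumed structure: a list of dicts).
sample = """
[
  {"screen_name": "user_a", "location": "Georgia, USA"},
  {"screen_name": "user_b", "location": "El Centro, CA"},
  {"screen_name": "user_c", "location": ""},
  {"screen_name": "user_d"}
]
"""

users = json.loads(sample)

# Collect the location of every user, using "" when the key is missing.
locations = [user.get("location", "") for user in users]

# Keep only the non-empty values for further processing.
non_empty = [loc for loc in locations if loc.strip()]

if __name__ == "__main__":
    # Each entry is a dict, so .items() walks its key/value pairs.
    for key, value in users[0].items():
        print(key, "=", value)
    print(non_empty)
```

Using dict.get with a default avoids a KeyError for users who have no location field at all.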
Reply
#8
This tool might help: JSON Formatter
Reply

