Python Forum

Full Version: [nltk] Relations Extractor
Hello

I use nltk.sem.relextract to extract relations from text, and I have run into a problem.

What I do:

1. Tools:

import nltk
import re 
from nltk.chunk import ne_chunk_sents
from nltk.sem import relextract
2. Take a simple sentence and prepare it:

sent = "China is in Asia"
sent = chunker(nltk.pos_tag(sent.split()))  # tokenize, POS-tag, and chunk into named entities
sent
Tree('S', [Tree('GPE', [('China', 'NNP')]), ('in', 'IN'), Tree('GPE', [('Asia', 'NNP')])])
3. Pattern:

IN = re.compile(r'.*\bin\b(?!\b.+ing)')
4. Processing:

for rel in nltk.sem.extract_rels('GPE', 'GPE', sent, corpus='ace', pattern=IN):
    print(nltk.sem.relextract.rtuple(rel))
[]
It fails: no relations are found. What is wrong?
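One thing that can be checked without NLTK is whether the pattern from step 3 matches at all. The sketch below applies it to a few filler strings (the text between two entities). The "word/TAG" format of these strings is an assumption made for illustration, not output taken from the post.

```python
import re

# Pattern copied from step 3 of the post.
IN = re.compile(r'.*\bin\b(?!\b.+ing)')

# Hypothetical filler strings in "word/TAG" form (an assumption about
# how the text between two entities is rendered before matching).
fillers = [
    "is/VBZ in/IN",           # as in "China is in Asia"
    "in/IN",                  # bare preposition
    "in/IN supervising/VBG",  # "in" followed by a gerund
    "is/VBZ near/IN",         # no lowercase "in" token at all
]

# re.match anchors at the start of the string; the leading ".*" lets
# "in" appear anywhere in the filler.
results = {filler: bool(IN.match(filler)) for filler in fillers}

for filler, matched in results.items():
    print(f"{filler!r}: {matched}")
```

Under that assumption the pattern accepts the filler for "China is in Asia", which suggests the empty result is more likely to come from how candidate relations are built than from the regular expression.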
It seems that your question is related to the following issue. However, your problem isn't fully reproducible, because the chunker function isn't defined.
chunker is just nltk.ne_chunk().

I read that post before I created this thread, but it does not work. I am doing everything as explained in "Natural Language Processing with Python" (pp. 284-290).
I have solved the problem by editing the code of the nltk package. Now it works.
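The post does not say which change was made. One commonly reported explanation, not verified here against the NLTK version in use, is that the step that turns entity/filler pairs into relation candidates also expects a following entity to supply right-hand context, so a sentence with exactly two entities produces no candidates. The sketch below is a simplified stand-alone model of that idea, not NLTK's own code; `to_pairs`, `candidate_relations` and `min_pairs` are hypothetical names introduced for illustration.

```python
def to_pairs(items):
    """Group a flat sentence into (words_before_entity, entity) pairs.

    Entities are modelled as (label, text) tuples, plain words as strings.
    Words after the last entity are not attached to any pair.
    """
    pairs, words = [], []
    for item in items:
        if isinstance(item, tuple):
            pairs.append((words, item))
            words = []
        else:
            words.append(item)
    return pairs


def candidate_relations(pairs, min_pairs):
    """Slide over the pairs and emit (subject, filler, object) candidates.

    min_pairs=3 models a loop that also needs a pair to the right of the
    object; min_pairs=2 models a relaxed loop that does not.
    """
    out = []
    while len(pairs) >= min_pairs:
        subj = pairs[0][1]
        filler, obj = pairs[1]
        out.append((subj[1], " ".join(filler), obj[1]))
        pairs = pairs[1:]
    return out


# "China is in Asia" with both place names marked as GPE entities.
sentence = [("GPE", "China"), "is", "in", ("GPE", "Asia")]
pairs = to_pairs(sentence)

strict = candidate_relations(pairs, min_pairs=3)
relaxed = candidate_relations(pairs, min_pairs=2)

print(strict)
print(relaxed)
```

In this model the strict variant yields nothing for a two-entity sentence, while the relaxed one yields a single China/Asia candidate. If the real cause is the same, a sentence with a third entity after "Asia" would be expected to produce a result without patching the package.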

But I still wonder why I have to do this to get a result for such a simple example as "China is in Asia". Why doesn't the author fix it?