Jun-14-2017, 05:53 PM
Hi, I would appreciate any help. I am a newbie trying to get "connectives" from a custom corpus: a file named my_text.txt, which contains the text "14States the of and to a in that is".
I am not sure if my understanding is correct. I am getting an error at "print(wordlists.words('connectives'))": "OSError: No such file or directory: '/Users/XXX/nltk_data/corpora/gutenberg_MINE/connectives'". I was expecting the NLTK API to find the connectives, but according to the error the code seems to expect a file or directory named connectives, which is quite confusing to me. Any help will be highly appreciated.
Many thanks
Kind Regards
raky
Sample Code :
from nltk.corpus import PlaintextCorpusReader
corpus_root = "/Users/XXX/nltk_data/corpora/gutenberg_MINE"
wordlists = PlaintextCorpusReader(corpus_root, "my_text.txt")
print(wordlists.words('connectives'))
Reference Code :
>>> from nltk.corpus import PlaintextCorpusReader
>>> corpus_root = '/usr/share/dict'
>>> wordlists = PlaintextCorpusReader(corpus_root, '.*')
>>> wordlists.files()
('README', 'connectives', 'propernames', 'web2', 'web2a', 'words')
>>> wordlists.words('connectives')
['the', 'of', 'and', 'to', 'a', 'in', 'that', 'is', ...]
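A possible reading of the error, offered as a sketch and not a confirmed diagnosis: in the reference code, 'connectives' appears to be the name of a file inside /usr/share/dict, so the argument to words() is a file ID and not a category of words. Under that assumption, the custom corpus would be queried by its own file name. The temporary directory below is a hypothetical stand-in for the real corpus folder, and fileids() is used in place of the older files() call shown in the reference code.

```python
import os
import tempfile

from nltk.corpus import PlaintextCorpusReader

# Hypothetical stand-in for the real corpus directory: a temporary folder
# holding a single file, my_text.txt, with the text quoted in the post.
corpus_root = tempfile.mkdtemp()
with open(os.path.join(corpus_root, "my_text.txt"), "w", encoding="utf-8") as f:
    f.write("14States the of and to a in that is")

# The second argument selects which files in corpus_root belong to the corpus.
wordlists = PlaintextCorpusReader(corpus_root, "my_text.txt")

# List the file IDs the reader knows about; words() expects one of these.
file_ids = wordlists.fileids()
print(file_ids)

# Ask for the words of the file by its file ID, not by the name 'connectives'.
words = list(wordlists.words("my_text.txt"))
print(words)
```

If this reading is right, there is no file called connectives in the custom corpus folder, which would explain why the reader reports a missing file at that path.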