Python Forum

How can I access the corpus


Hello,

I created a corpus from a number of documents like this:

from gensim import models, corpora

corpus = corpora.BleiCorpus('./data/ap/ap.dat', './data/ap/vocab.txt')

Now I want to access its documents for comparison purposes. How can I do that?

Thanks
There are many tutorials here: https://radimrehurek.com/gensim/tutorial.html
This all appears to be quite new; it only shows up on PyPI as of Nov 11, 2017.
You can probably also load the corpus into NLTK, which is a very mature
and very well-documented natural language processing package.
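
To answer the original question a bit more directly: a gensim corpus is an iterable, so looping over it should give you one document at a time, each as a list of (word_id, count) pairs. Below is a rough sketch, not a definitive recipe. The helper names (bow_to_words, cosine, show_first_documents) are made up for this post, cosine similarity is just one possible way to "compare" documents, and I'm assuming the paths from the question exist and that BleiCorpus exposes the vocabulary as corpus.id2word.

```python
def bow_to_words(doc, id2word):
    """Turn one bag-of-words document [(word_id, count), ...] into {word: count}."""
    return {id2word[word_id]: count for word_id, count in doc}


def cosine(doc_a, doc_b):
    """Cosine similarity between two sparse bag-of-words documents."""
    a, b = dict(doc_a), dict(doc_b)
    dot = sum(weight * b.get(word_id, 0.0) for word_id, weight in a.items())
    norm_a = sum(w * w for w in a.values()) ** 0.5
    norm_b = sum(w * w for w in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def show_first_documents(limit=3):
    """Hypothetical helper: print the first few documents and compare two of them."""
    # Imported here so the helpers above work even without gensim installed.
    from gensim import corpora

    # Paths taken from the question; adjust them to your own data.
    corpus = corpora.BleiCorpus('./data/ap/ap.dat', './data/ap/vocab.txt')

    docs = []
    for doc in corpus:  # streams documents from disk, one at a time
        docs.append(doc)
        if len(docs) >= limit:
            break

    for i, doc in enumerate(docs):
        # Assumption: corpus.id2word maps word ids to the vocabulary strings.
        print(i, bow_to_words(doc, corpus.id2word))

    if len(docs) >= 2:
        print('similarity of doc 0 and doc 1:', cosine(docs[0], docs[1]))
```

Calling show_first_documents() prints the first three documents as word counts plus one similarity score. Iterating is used instead of corpus[0] because, as far as I know, indexing by position only works once an index file has been built for the corpus.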