Jun-24-2019, 10:36 AM
Hello
I'm using nltk.NaiveBayesClassifier to do sentiment analysis, and I've run into a problem.
What I do:
1. Take lists of positive and negative words and shuffle them.
2. Use the NLTK movie_reviews corpus
docs = [(list(movie_reviews.words(fileid)), category)
        for category in movie_reviews.categories()
        for fileid in movie_reviews.fileids(category)]
3. Function to represent a text as a vector of features
def vector(doc):
    doc_words = set(doc)
    vect = {}
    for w in words:  # words = pos_words + neg_words
        vect[w] = (w in doc_words)
    return vect
4. Take all labelled reviews and represent them as (feature vector, label) pairs
5. Train the classifier
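To make steps 1, 4 and 5 concrete, here is a minimal self-contained sketch of the pipeline (toy word lists and toy documents stand in for my real lexicon and the movie_reviews corpus; variable names are illustrative):

```python
import random
import nltk

# Step 1: toy stand-ins for the real positive/negative word lists.
pos_words = ['good', 'great', 'astounding']
neg_words = ['bad', 'awful', 'ludicrous']
words = pos_words + neg_words
random.shuffle(words)

# Step 3: represent a document as a dict of binary word features.
def vector(doc):
    doc_words = set(doc)
    return {w: (w in doc_words) for w in words}

# Toy labelled documents standing in for the movie_reviews corpus.
docs = [
    (['a', 'good', 'great', 'film'], 'pos'),
    (['an', 'astounding', 'good', 'movie'], 'pos'),
    (['a', 'bad', 'awful', 'film'], 'neg'),
    (['a', 'ludicrous', 'bad', 'plot'], 'neg'),
]

# Step 4: (feature vector, label) pairs.
featuresets = [(vector(d), label) for d, label in docs]

# Step 5: train the classifier.
classifier = nltk.NaiveBayesClassifier.train(featuresets)

print(classifier.classify(vector(['good', 'film'])))  # prints: pos
```

Note that the test input here is built with the same vector() function used for training, so every feature key is present.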
>>> classifier.show_most_informative_features()
Most Informative Features
   astounding = 1    pos : neg  =  12.3 : 1.0
  outstanding = 1    pos : neg  =  11.5 : 1.0
    ludicrous = 1    neg : pos  =  11.0 : 1.0
  fascination = 1    pos : neg  =  11.0 : 1.0
    insulting = 1    neg : pos  =  11.0 : 1.0
        sucks = 1    neg : pos  =  10.6 : 1.0
     seamless = 1    pos : neg  =  10.3 : 1.0
       hatred = 1    pos : neg  =  10.3 : 1.0
        dread = 1    pos : neg  =   9.7 : 1.0
   accessible = 1    pos : neg  =   9.7 : 1.0

TEST:
sent1 = {'good': 1}  # just the one word "good"
>>> classifier.classify(sent1)
'neg'
Fail!
What is wrong?