Python Forum
[nltk] Naive Bayes Classifier
Hello

I am using nltk.NaiveBayesClassifier for opinion (sentiment) analysis and have run into a problem.

What I do:

1. Take lists of negative and positive words and shuffle them.

2. Use the NLTK movie_reviews corpus

from nltk.corpus import movie_reviews

# One (list of words, category) pair per review file
docs = [ (list(movie_reviews.words(fileid)), category)
     for category in movie_reviews.categories()
     for fileid in movie_reviews.fileids(category)]
3. Define a function that represents a text as a feature vector

def vector(doc):
    doc_words = set(doc)
    vect = {}
    for w in words:  # words = pos_words + neg_words
        vect[w] = (w in doc_words)
    return vect
4. Take all labelled reviews and represent each one as a (feature vector, label) pair

5. Train the classifier
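
Steps 4 and 5 are described without code, so here is a minimal sketch of how they might look. The word lists and the four toy documents are hypothetical stand-ins (not the real lists or corpus), and `train_set` and `train` are names made up for this illustration. The only NLTK call used is `nltk.NaiveBayesClassifier.train`, which takes a list of (feature dict, label) pairs.

```python
import random

# Hypothetical stand-ins for the real word lists and labelled corpus.
pos_words = ["good", "outstanding"]
neg_words = ["bad", "ludicrous"]
words = pos_words + neg_words

docs = [
    (["an", "outstanding", "and", "good", "film"], "pos"),
    (["a", "good", "cast"], "pos"),
    (["a", "ludicrous", "plot"], "neg"),
    (["bad", "acting", "bad", "script"], "neg"),
]


def vector(doc):
    """Map a tokenised document to {word: is it present?} over the word list."""
    doc_words = set(doc)
    return {w: (w in doc_words) for w in words}


# Step 4: one (feature dict, label) pair per review.
train_set = [(vector(doc), label) for doc, label in docs]
random.shuffle(train_set)


def train(labeled_featuresets):
    """Step 5: train NLTK's naive Bayes classifier on the pairs."""
    import nltk  # imported here so the feature code above runs without NLTK
    return nltk.NaiveBayesClassifier.train(labeled_featuresets)
```

With NLTK installed, `classifier = train(train_set)` gives a classifier, and a test input can be built with the same function, for example `classifier.classify(vector(["good"]))`, so that it has the same feature keys as the training data.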

>>> classifier.show_most_informative_features()
Most Informative Features
              astounding = 1                 pos : neg    =     12.3 : 1.0
             outstanding = 1                 pos : neg    =     11.5 : 1.0
               ludicrous = 1                 neg : pos    =     11.0 : 1.0
             fascination = 1                 pos : neg    =     11.0 : 1.0
               insulting = 1                 neg : pos    =     11.0 : 1.0
                   sucks = 1                 neg : pos    =     10.6 : 1.0
                seamless = 1                 pos : neg    =     10.3 : 1.0
                  hatred = 1                 pos : neg    =     10.3 : 1.0
                   dread = 1                 pos : neg    =      9.7 : 1.0
              accessible = 1                 pos : neg    =      9.7 : 1.0
TEST:

sent1 = { 'good' : 1 }  # just one word, "good"
>>> classifier.classify(sent1)
'neg'
Fail!

What is wrong?
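
One check that might help narrow this down is how often the test word appears in each class of the training data. Assuming the classifier scores only the features present in the dict it is given (an assumption about NLTK's behaviour, not verified here), a one-feature input with equal class priors is decided by which class contains the word in a larger share of its documents. The sketch below uses hypothetical toy data, and `presence_rate_by_label` is a helper made up for this illustration.

```python
def presence_rate_by_label(word, labelled_docs):
    """Return {label: fraction of that label's documents containing word}."""
    totals = {}
    hits = {}
    for doc_words, label in labelled_docs:
        totals[label] = totals.get(label, 0) + 1
        if word in doc_words:
            hits[label] = hits.get(label, 0) + 1
    return {label: hits.get(label, 0) / totals[label] for label in totals}


# Hypothetical toy data; with the real corpus, pass `docs` from step 2.
toy_docs = [
    (["not", "good", "at", "all"], "neg"),
    (["good", "idea", "bad", "film"], "neg"),
    (["a", "good", "film"], "pos"),
    (["an", "astounding", "film"], "pos"),
]

rates = presence_rate_by_label("good", toy_docs)
print(rates)
```

If the rate for 'neg' turns out higher than for 'pos' on the real corpus, a 'neg' result for the single word "good" would be consistent with the training data rather than a bug. It is also worth confirming that "good" is in `words` at all.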


Messages In This Thread
[nltk] Naive Bayes Classifier - by constantin01 - Jun-24-2019, 10:36 AM
