Oct-21-2016, 09:22 PM
So, after fooling around with NLTK's Naive Bayes classifier, I've noticed that it's far too slow, especially for analyzing large sets of data. I assume that's because NLTK is meant as a learning toolkit.
I want to be able to retain the function of Naive Bayes without the insane amount of time it takes to process.
Can I use scikit-learn as a wrapper of some sort instead?
That seems like it would be better equipped to deal with the problem.
Here's my code. Feel free to make revisions in addition to helping me speed up the processing time:
import random

import nltk
from nltk.corpus import movie_reviews

# Pair each review's word list with its category ("pos" or "neg").
documents = [(list(movie_reviews.words(fileid)), category)
             for category in movie_reviews.categories()
             for fileid in movie_reviews.fileids(category)]
random.shuffle(documents)

# Frequency distribution over every lowercased word in the corpus.
all_words = nltk.FreqDist(w.lower() for w in movie_reviews.words())
word_features = list(all_words.keys())[:3000]

def find_features(document):
    # Map each feature word to whether it occurs in the document.
    words = set(document)
    return {w: (w in words) for w in word_features}

print(find_features(movie_reviews.words('neg/cv000_29416.txt')))

featuresets = [(find_features(rev), category) for (rev, category) in documents]

training_set = featuresets[:1900]
testing_set = featuresets[1900:]  # held-out reviews, disjoint from training_set

classifier = nltk.NaiveBayesClassifier.train(training_set)
print("Naive Bayes Algo accuracy percent:",
      nltk.classify.accuracy(classifier, testing_set) * 100)
classifier.show_most_informative_features(15)
Output:
False, u'effected': False, u'compared': False, u'nonetheless': False, u'deadly': False, u'purproses': False, u'lately': False, u'kerrigans': False, u'compares': False, u'details': False, u'behold': False, u'vulgarize': False, u'illusion': False, u'ponytail': False, u'rebelled': False, u'repeat': False, u'zhou': False, u'treason': False, u'allotting': False, u'impregnating': False, u'tinier': False, u'trunchbull': False, u'laude': False, u'exposure': False, u'searches': False, u'ustinov': False, u'disatisfaction': False, u'mishears': False, u'torrid': False, u'compete': False, u'lestat': False, u'villainous': False, u'searched': False, u'gardens': False, u'homerian': False}
('Naive Bayes Algo accuracy percent:', 87.78947368421053)
Most Informative Features
       insulting = True    neg : pos = 10.6 : 1.0
            sans = True    neg : pos =  8.4 : 1.0
         wasting = True    neg : pos =  8.4 : 1.0
    refreshingly = True    pos : neg =  8.3 : 1.0
      mediocrity = True    neg : pos =  7.7 : 1.0
       dismissed = True    pos : neg =  7.0 : 1.0
     bruckheimer = True    neg : pos =  6.3 : 1.0
       sumptuous = True    pos : neg =  6.3 : 1.0
      cronenberg = True    pos : neg =  6.3 : 1.0
          fabric = True    pos : neg =  6.3 : 1.0
             ugh = True    neg : pos =  5.8 : 1.0
          doubts = True    pos : neg =  5.8 : 1.0
          bounce = True    neg : pos =  5.7 : 1.0
           wires = True    neg : pos =  5.7 : 1.0
            wits = True    pos : neg =  5.7 : 1.0
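
On the wrapper question: NLTK ships `nltk.classify.scikitlearn.SklearnClassifier`, which wraps a scikit-learn estimator so it can be trained on the same (feature dict, label) pairs as `nltk.NaiveBayesClassifier`. Below is a minimal sketch of what that might look like, assuming scikit-learn is installed. The tiny `train_set` and `test_set` are made-up stand-ins for the real featuresets, and `BernoulliNB` is only one plausible choice for True/False features. Whether this is actually faster on the full movie review data would have to be timed.

```python
from nltk.classify import accuracy
from nltk.classify.scikitlearn import SklearnClassifier
from sklearn.naive_bayes import BernoulliNB

# Hypothetical toy data standing in for the (find_features(rev), category) pairs.
train_set = [
    ({"wasting": True, "sumptuous": False}, "neg"),
    ({"wasting": True, "sumptuous": False}, "neg"),
    ({"wasting": False, "sumptuous": True}, "pos"),
    ({"wasting": False, "sumptuous": True}, "pos"),
]
test_set = [
    ({"wasting": True, "sumptuous": False}, "neg"),
    ({"wasting": False, "sumptuous": True}, "pos"),
]

# The wrapper vectorizes the feature dicts and hands training to scikit-learn.
sk_classifier = SklearnClassifier(BernoulliNB()).train(train_set)

# The wrapped classifier works with the usual NLTK helpers.
print("BernoulliNB accuracy percent:", accuracy(sk_classifier, test_set) * 100)
print(sk_classifier.classify({"wasting": True, "sumptuous": False}))
```

With the real data, `train_set` and `test_set` would be replaced by `training_set` and `testing_set` from the code above. Note that the wrapper does not provide `show_most_informative_features`.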