 Low accuracy for fake news detection model
#1
I am trying to implement a model for fake news detection. The dataset I am using comes from this source:
https://drive.google.com/file/d/1er9NJTL...4a-_q/view

I am getting around 82% accuracy, which is low compared to other people's models. Is there a way to improve the accuracy of my model?


import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, confusion_matrix

df=pd.read_csv('news.csv')
print(df.head())

labels=df.label
features = df['text']

x_train,x_test,y_train,y_test=train_test_split(features , labels, test_size=0.2, random_state=7)

tfidf_vectorizer=TfidfVectorizer(stop_words="english",max_df=0.7, analyzer='word',sublinear_tf = True)

tfidf_train=tfidf_vectorizer.fit_transform(x_train) 
tfidf_test=tfidf_vectorizer.transform(x_test)

tfidf_test.shape

from sklearn.naive_bayes import MultinomialNB
clf = MultinomialNB().fit(tfidf_train, y_train)

y_pred=clf.predict(tfidf_test)
score=accuracy_score(y_test,y_pred)
print(f'Accuracy: {round(score*100,2)}%')

confusion_matrix(y_test,y_pred, labels=['FAKE','REAL'])


Output:
Accuracy: 82.24%
array([[419, 219],
       [  6, 623]], dtype=int64)
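
One way to see where the 82% comes from is to read the confusion matrix above per class. The sketch below recomputes accuracy, recall and precision from the four printed counts. It relies on scikit-learn's convention that rows are true labels and columns are predicted labels, in the `labels=['FAKE','REAL']` order passed in the code. The variable names are illustrative only and are not part of the original script.

```python
# Confusion matrix copied from the output above. Rows are true labels and
# columns are predicted labels, both in the order ['FAKE', 'REAL'].
class_names = ['FAKE', 'REAL']
cm = [[419, 219],
      [6, 623]]

total = sum(sum(row) for row in cm)
accuracy = sum(cm[i][i] for i in range(len(class_names))) / total

recall = {}
precision = {}
for i, name in enumerate(class_names):
    true_count = sum(cm[i])                      # articles whose true label is `name`
    predicted_count = sum(row[i] for row in cm)  # articles predicted as `name`
    recall[name] = cm[i][i] / true_count
    precision[name] = cm[i][i] / predicted_count
    print(f'{name}: recall={recall[name]:.3f}, precision={precision[name]:.3f}')

print(f'Accuracy: {accuracy * 100:.2f}%')
```

The errors are one-sided: 219 of the 638 FAKE articles are predicted as REAL, while only 6 of the 629 REAL articles are misclassified. The low accuracy therefore comes almost entirely from missed FAKE articles.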
#2
Improving accuracy usually means trying different models and tuning their parameters to see what works best. On this dataset, the Multinomial Naive Bayes classifier does not perform well. Of the models I tried, the Passive Aggressive classifier gave about 93% accuracy.
The implementation with the Passive Aggressive classifier is given below:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.linear_model import PassiveAggressiveClassifier

df=pd.read_csv('news.csv')
#print(df.head())

labels=df.label
features = df['text']

x_train,x_test,y_train,y_test=train_test_split(features , labels, test_size=0.2, random_state=7)

tfidf_vectorizer=TfidfVectorizer(stop_words="english",max_df=0.7, analyzer='word',sublinear_tf = True)

tfidf_train=tfidf_vectorizer.fit_transform(x_train) 
tfidf_test=tfidf_vectorizer.transform(x_test)

pac=PassiveAggressiveClassifier(max_iter=50)
pac.fit(tfidf_train,y_train)

y_pred=pac.predict(tfidf_test)
score=accuracy_score(y_test,y_pred)
print(f'Accuracy: {round(score*100,2)}%')

confusion_matrix(y_test,y_pred, labels=['FAKE','REAL'])
Output:
Accuracy: 93.37%
array([[594,  44],
       [ 40, 589]], dtype=int64)
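
The advice above about trying different models can be sketched as a small comparison loop. This is a minimal sketch, not the method used in the reply: `compare_models` is a hypothetical helper, `LogisticRegression` and `LinearSVC` are candidates chosen here purely for illustration, and the toy corpus is made up so the snippet runs without `news.csv`. It scores each model with cross-validation rather than a single train/test split, so the comparison depends less on one particular split.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def compare_models(texts, labels, cv=5):
    """Return the mean cross-validated accuracy for a few candidate classifiers."""
    # Illustrative candidates; other classifiers can be added to this dict.
    candidates = {
        'MultinomialNB': MultinomialNB(),
        'LogisticRegression': LogisticRegression(max_iter=1000),
        'LinearSVC': LinearSVC(),
    }
    results = {}
    for name, clf in candidates.items():
        # The vectorizer sits inside the pipeline so it is refit on each
        # training fold and never sees the held-out fold.
        pipe = make_pipeline(
            TfidfVectorizer(stop_words='english', max_df=0.7, sublinear_tf=True),
            clf,
        )
        scores = cross_val_score(pipe, texts, labels, cv=cv, scoring='accuracy')
        results[name] = scores.mean()
    return results


# Tiny made-up corpus, only to show the call; it is not the thread's news.csv.
toy_texts = [
    'aliens secretly control the moon landing footage',
    'miracle pill cures every disease overnight doctors shocked',
    'secret lizard rulers control world banks',
    'shocking miracle trick doctors hate revealed',
    'parliament passed the annual budget after debate',
    'central bank raised interest rates on tuesday',
    'city council approved new budget for schools',
    'researchers published peer reviewed study on vaccines',
]
toy_labels = ['FAKE'] * 4 + ['REAL'] * 4

toy_results = compare_models(toy_texts, toy_labels, cv=2)
for model_name, mean_accuracy in toy_results.items():
    print(f'{model_name}: {mean_accuracy:.2f}')
```

With the real data, the call would be `compare_models(df['text'], df.label)`. The scores on the toy corpus mean nothing on their own; only the structure of the comparison carries over.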