Python Forum
Low accuracy for fake news detection model
#1
I am trying to implement a fake news detection model. The dataset I am using comes from this source:
https://drive.google.com/file/d/1er9NJTL...4a-_q/view

I am getting around 82% accuracy, which is low compared to other people's models. How can I improve the accuracy of my model?


import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Load the dataset; the code below uses its 'text' and 'label' columns.
df = pd.read_csv('news.csv')
print(df.head())

labels = df.label
features = df['text']

# Hold out 20% of the articles for testing.
x_train, x_test, y_train, y_test = train_test_split(features, labels, test_size=0.2, random_state=7)

# Fit TF-IDF on the training text only, then reuse the same vocabulary for the test text.
tfidf_vectorizer = TfidfVectorizer(stop_words="english", max_df=0.7, analyzer='word', sublinear_tf=True)
tfidf_train = tfidf_vectorizer.fit_transform(x_train)
tfidf_test = tfidf_vectorizer.transform(x_test)

tfidf_test.shape

clf = MultinomialNB().fit(tfidf_train, y_train)

y_pred = clf.predict(tfidf_test)
score = accuracy_score(y_test, y_pred)
print(f'Accuracy: {round(score*100,2)}%')

# Rows are true labels, columns are predicted labels.
confusion_matrix(y_test, y_pred, labels=['FAKE', 'REAL'])
Output:
Accuracy: 82.24%
array([[419, 219],
       [  6, 623]], dtype=int64)
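The overall accuracy hides where the errors are. As a rough sketch, the confusion matrix printed above can be broken down per class with plain Python. The helper name `per_class_report` is made up for this illustration, and the rows are assumed to be true labels and the columns predicted labels, which is the convention scikit-learn's `confusion_matrix` uses.

```python
def per_class_report(matrix, labels):
    """Compute accuracy plus per-class recall and precision.

    Assumes matrix[i][j] counts samples whose true label is labels[i]
    and whose predicted label is labels[j].
    """
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(labels)))
    report = {"accuracy": correct / total}
    for i, label in enumerate(labels):
        actual = sum(matrix[i])                    # samples truly in this class
        predicted = sum(row[i] for row in matrix)  # samples predicted as this class
        report[label] = {
            "recall": matrix[i][i] / actual if actual else 0.0,
            "precision": matrix[i][i] / predicted if predicted else 0.0,
        }
    return report


# Numbers copied from the confusion matrix in the post above.
report = per_class_report([[419, 219], [6, 623]], ["FAKE", "REAL"])
print(f"accuracy: {report['accuracy']:.2%}")
for label in ("FAKE", "REAL"):
    stats = report[label]
    print(f"{label}: recall={stats['recall']:.2%} precision={stats['precision']:.2%}")
```

Read this way, 219 of the 638 FAKE articles were predicted as REAL, while only 6 of the 629 REAL articles were predicted as FAKE, so nearly all of the lost accuracy comes from missed FAKE articles.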
#2
Improving accuracy usually means trying different models and tuning their parameters to see what works best. On this dataset, the Multinomial Naive Bayes classifier does not perform well. After trying several models, I found that the Passive Aggressive classifier reached about 93% accuracy.
An implementation using the Passive Aggressive classifier is given below:

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import PassiveAggressiveClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

df = pd.read_csv('news.csv')
#print(df.head())

labels = df.label
features = df['text']

# Same split and TF-IDF settings as in the original post, so the scores are comparable.
x_train, x_test, y_train, y_test = train_test_split(features, labels, test_size=0.2, random_state=7)

tfidf_vectorizer = TfidfVectorizer(stop_words="english", max_df=0.7, analyzer='word', sublinear_tf=True)
tfidf_train = tfidf_vectorizer.fit_transform(x_train)
tfidf_test = tfidf_vectorizer.transform(x_test)

# No random_state is set here, so the score can vary slightly between runs.
pac = PassiveAggressiveClassifier(max_iter=50)
pac.fit(tfidf_train, y_train)

y_pred = pac.predict(tfidf_test)
score = accuracy_score(y_test, y_pred)
print(f'Accuracy: {round(score*100,2)}%')

# Rows are true labels, columns are predicted labels.
confusion_matrix(y_test, y_pred, labels=['FAKE', 'REAL'])
Output:
Accuracy: 93.37%
array([[594, 44],
       [ 40, 589]], dtype=int64)
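The advice to try different models and tune their parameters could be sketched roughly as follows. This is only an illustration. The function name `compare_models`, the choice of candidate classifiers, the `alpha=0.1` setting and the tiny toy corpus are all assumptions made for this example and do not come from the thread. Each candidate is wrapped in a pipeline with the same TF-IDF settings used above and scored with cross-validation, so the comparison does not depend on a single train/test split.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression, PassiveAggressiveClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def compare_models(texts, labels, cv=5):
    """Return the mean cross-validated accuracy for a few candidate classifiers."""
    # Candidate list is an assumption for illustration; swap in whatever you want to try.
    candidates = {
        "MultinomialNB(alpha=1.0)": MultinomialNB(alpha=1.0),
        "MultinomialNB(alpha=0.1)": MultinomialNB(alpha=0.1),
        "PassiveAggressive": PassiveAggressiveClassifier(max_iter=50, random_state=7),
        "LinearSVC": LinearSVC(),
        "LogisticRegression": LogisticRegression(max_iter=1000),
    }
    results = {}
    for name, clf in candidates.items():
        # The vectorizer sits inside the pipeline so it is refit on each training fold.
        pipeline = make_pipeline(
            TfidfVectorizer(stop_words="english", max_df=0.7, sublinear_tf=True),
            clf,
        )
        scores = cross_val_score(pipeline, texts, labels, cv=cv, scoring="accuracy")
        results[name] = scores.mean()
    return results


# Toy stand-in data so the sketch runs without news.csv; the scores it gives mean nothing.
toy_texts = [
    "Aliens secretly control the world banks, insiders claim",
    "Miracle fruit cures every disease overnight, doctors stunned",
    "Celebrity clone spotted shopping after secret replacement",
    "Government hides weather machine under mountain, leaked memo says",
    "City council approves budget for new public library",
    "Central bank keeps interest rates unchanged this quarter",
    "Local school opens renovated science laboratory",
    "Senate committee schedules hearing on transport funding",
]
toy_labels = ["FAKE"] * 4 + ["REAL"] * 4

results = compare_models(toy_texts, toy_labels, cv=2)
for name, mean_accuracy in sorted(results.items(), key=lambda item: -item[1]):
    print(f"{name}: {mean_accuracy:.2%}")
```

With the real data, the call would presumably be `compare_models(df['text'], df.label)`. Which candidate comes out ahead depends on the dataset, so treat the ranking as something to measure, not something to assume.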