I am new to machine learning and Python. Recently I have been working with the Amazon Fine Food Reviews data from Kaggle and the code that comes with it.
What I don't understand is how the 'partition' function is used here.
Also, what do the last 3 lines of the code actually do?
%matplotlib inline

import sqlite3
import pandas as pd
import numpy as np
import nltk
import string
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import confusion_matrix
from sklearn import metrics
from sklearn.metrics import roc_curve, auc
from nltk.stem.porter import PorterStemmer

# using the SQLite Table to read data.
con = sqlite3.connect('./amazon-fine-food-reviews/database.sqlite')

# filtering only positive and negative reviews i.e.
# not taking into consideration those reviews with Score=3
filtered_data = pd.read_sql_query("""
SELECT *
FROM Reviews
WHERE Score != 3
""", con)

# Give reviews with Score>3 a positive rating, and reviews with a
# score<3 a negative rating.
def partition(x):
    if x < 3:
        return 'negative'
    return 'positive'

# changing reviews with score less than 3 to be negative and vice versa
actualScore = filtered_data['Score']
positiveNegative = actualScore.map(partition)
filtered_data['Score'] = positiveNegative
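To make the question concrete, here is a minimal sketch of what the last three lines appear to do, run on a small made-up DataFrame instead of the real database. The name `toy` and its scores are assumptions for illustration only; `partition` is copied from the code above. As far as I can tell, `Series.map` calls `partition` once per value in the `Score` column and returns a new Series of labels, which is then written back over the original column.

```python
import pandas as pd

def partition(x):
    # Same rule as in the code above: below 3 is negative, otherwise positive
    if x < 3:
        return 'negative'
    return 'positive'

# Hypothetical stand-in for filtered_data (made-up scores, none equal to 3)
toy = pd.DataFrame({'Score': [1, 2, 4, 5]})

actual_score = toy['Score']                      # select the numeric column
positive_negative = actual_score.map(partition)  # apply partition to each value
toy['Score'] = positive_negative                 # overwrite numbers with labels

print(toy['Score'].tolist())
# ['negative', 'negative', 'positive', 'positive']
```

Note that `partition` would also label a score of exactly 3 as 'positive'; that case never arises here only because the SQL query already excludes `Score = 3`.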