trying to understand the python code - Python Forum (https://python-forum.io) - Forum: Data Science - Thread: /thread-8752.html
trying to understand the python code - AkashDubey - Mar-05-2018

I am new to machine learning and Python. Recently I have been working with the Amazon Fine Food Reviews data from Kaggle and its code. What I don't understand is how the 'partition' function is used here. Also, what do the last 3 lines of code actually do?

    %matplotlib inline
    import sqlite3
    import pandas as pd
    import numpy as np
    import nltk
    import string
    import matplotlib.pyplot as plt
    import seaborn as sns
    from sklearn.feature_extraction.text import TfidfTransformer
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.metrics import confusion_matrix
    from sklearn import metrics
    from sklearn.metrics import roc_curve, auc
    from nltk.stem.porter import PorterStemmer

    # Using the SQLite table to read data.
    con = sqlite3.connect('./amazon-fine-food-reviews/database.sqlite')

    # Filtering only positive and negative reviews, i.e.
    # not taking into consideration those reviews with Score=3.
    filtered_data = pd.read_sql_query("""
    SELECT *
    FROM Reviews
    WHERE Score != 3
    """, con)

    # Give reviews with Score>3 a positive rating, and reviews with a
    # Score<3 a negative rating.
    def partition(x):
        if x < 3:
            return 'negative'
        return 'positive'

    # Relabel scores below 3 as 'negative' and scores above 3 as 'positive'.
    actualScore = filtered_data['Score']
    positiveNegative = actualScore.map(partition)
    filtered_data['Score'] = positiveNegative

RE: trying to understand the python code - buran - Mar-06-2018

1. actualScore will be just the column Score from the dataframe.
2. actualScore.map(partition) will apply (i.e. map) the function partition to every element of actualScore, creating positiveNegative.
3. filtered_data['Score'] = positiveNegative will replace the values in the Score column of the dataframe with the values from positiveNegative.

As a result, the Score column of the dataframe will hold just the values 'positive' (original Score > 3: partition returns 'positive' for anything not below 3, and the query has already excluded Score = 3) and 'negative' (original Score < 3).
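
To see the three steps in isolation, here is a minimal sketch that runs without the Kaggle database. The small `reviews` dataframe and its `Id` column are made-up stand-ins for the Reviews table, and the boolean filter is assumed to play the role of `WHERE Score != 3`; only `partition` and the `map` step mirror the code in the question.

```python
import pandas as pd

# Hypothetical stand-in for the Reviews table (the real data lives in SQLite).
reviews = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5],
    "Score": [1, 2, 3, 4, 5],
})

# Assumed equivalent of "WHERE Score != 3": drop the neutral reviews.
filtered_data = reviews[reviews["Score"] != 3].copy()


def partition(x):
    """Label one score: below 3 is 'negative', anything else is 'positive'."""
    if x < 3:
        return 'negative'
    return 'positive'


# Step 1: take the Score column as a Series.
actual_score = filtered_data["Score"]
# Step 2: call partition once per element and collect the results in a new Series.
positive_negative = actual_score.map(partition)
# Step 3: overwrite the numeric scores with the text labels.
filtered_data["Score"] = positive_negative

labels = filtered_data["Score"].tolist()
print(labels)  # ['negative', 'negative', 'positive', 'positive']
```

Note that `partition(3)` would return 'positive'; it never gets the chance here only because rows with a score of 3 are removed before `map` runs.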