Nov-15-2019, 01:34 PM
I want to calculate the average rating score for reviews that contain any of the words 'food', 'buffet', 'breakfast' or 'supper', to find out whether food quality affects the hotel rating. The code is:
# Importing Libraries
import numpy as np
import pandas as pd

# Import dataset
dataset = pd.read_csv("../output_rating.tsv", delimiter = '\t')
datatop = dataset.head()
datatop

   Rating                                             Review  Liked
0      30  It would be difficult to find a hotel with a b...      0
1      50  Clean rooms. Helpful staff. Close to waterfron...      1
2      50  This hotel deserves all the accolades. We cele...      1
3      50  I stayed at the Protea with my husband and our...      1
4      30  I stayed here while touring around SA and this...      1
# library to clean data
import re
import collections

# Natural Language Tool Kit
import nltk

#nltk.download('stopwords')

# to remove stopword
from nltk.corpus import stopwords

# for Stemming purpose
from nltk.stem.porter import PorterStemmer

# Initialize empty array
# to append clean text
corpus = []

# 1000 (reviews) rows to clean
for i in range(0, 2259):

    # column : "Review", row ith
    review = re.sub('[^a-zA-Z]', ' ', dataset['Review'][i])

    # convert all cases to lower cases
    review = review.lower()

    # split to array(default delimiter is " ")
    review = review.split()

    # creating PorterStemmer object to
    # take main stem of each word
    ps = PorterStemmer()

    # loop for stemming each word
    # in string array at ith row
    review = [ps.stem(word) for word in review
              if not word in set(stopwords.words('english'))]

    #MEAN RATING FOR FOOD COMMENTS
    wanted = "buffet breakfast food supper"
    avgRating = 0
    cnt = collections.Counter()
    word = dataset['Review'][i]
    if word in wanted:
        cnt[word]+=1
        print(cnt)
        avgRating = avgRating + dataset['Rating'][i]
    #END RATING SCORE

    # rejoin all string array elements
    # to create back into a string
    review = ' '.join(review)

    # append each string to create
    # array of clean text
    corpus.append(review)

The part of the code I expect to calculate the average score is not giving me any output. The code is:
    #MEAN RATING FOR FOOD COMMENTS
    wanted = "buffet breakfast food supper"
    avgRating = 0
    cnt = collections.Counter()
    word = dataset['Review'][i]
    if word in wanted:
        cnt[word]+=1
        print(cnt)
        avgRating = avgRating + dataset['Rating'][i]
    #END RATING SCORE
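
For reference, here is a minimal sketch of the calculation described above, not a definitive fix. In the snippet, `word` holds the entire review text, so `word in wanted` asks whether the whole review is a substring of "buffet breakfast food supper". That is almost never true, which would explain the missing output. `avgRating` and `cnt` are also reset on every pass through the loop. The sketch instead checks whether any wanted word occurs in each review and averages the ratings of the matches. The helper name `mean_rating_for_words` and the sample ratings and reviews are hypothetical, made up for illustration only.

```python
import re

# Words that mark a review as food-related (taken from the question).
WANTED = {"food", "buffet", "breakfast", "supper"}


def mean_rating_for_words(ratings, reviews, wanted=WANTED):
    """Return (mean rating, match count) for reviews mentioning any wanted word.

    `ratings` and `reviews` are parallel iterables, e.g. two DataFrame columns.
    The mean is None when no review matches.
    """
    wanted = set(wanted)
    total = 0
    count = 0
    for rating, text in zip(ratings, reviews):
        # Split the review into lower-case alphabetic words.
        words = set(re.findall(r"[a-z]+", str(text).lower()))
        # Keep the review if it shares at least one word with `wanted`.
        if words & wanted:
            total += rating
            count += 1
    mean = total / count if count else None
    return mean, count


# Hypothetical sample data, standing in for the Rating and Review columns.
sample_ratings = [30, 50, 50, 40]
sample_reviews = [
    "The breakfast buffet was disappointing.",
    "Clean rooms. Helpful staff.",
    "Great food and a lovely supper.",
    "Food was fine, room was small.",
]

mean, count = mean_rating_for_words(sample_ratings, sample_reviews)
print(count, "food-related reviews, mean rating", mean)
```

With the dataset from the post, the call would presumably be `mean_rating_for_words(dataset['Rating'], dataset['Review'])`, placed outside the cleaning loop. The match is on exact raw words, so a variant such as "breakfasts" is not counted; comparing stemmed words on both sides would be one way to catch those.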