 Calculate Rating Score for Reviews Containing Specific Words
#1
I want to calculate the average rating score for reviews that contain any of the words 'food', 'buffet', 'breakfast', or 'supper'. I would like to find out if the food quality affects the hotel rating. The code is:

# Importing Libraries 
import numpy as np   
import pandas as pd  
# Import dataset 
dataset = pd.read_csv("../output_rating.tsv", delimiter = '\t')
datatop = dataset.head()
datatop
   Rating  Review                                             Liked
0      30  It would be difficult to find a hotel with a b...      0
1      50  Clean rooms. Helpful staff. Close to waterfron...      1
2      50  This hotel deserves all the accolades. We cele...      1
3      50  I stayed at the Protea with my husband and our...      1
4      30  I stayed here while touring around SA and this...      1

# library to clean data 
import re  
import collections


  
# Natural Language Tool Kit 
import nltk  
  
#nltk.download('stopwords') 
  
# to remove stopword 
from nltk.corpus import stopwords 
  
# for stemming purposes
from nltk.stem.porter import PorterStemmer 
  
# Initialize empty array 
# to append clean text  
corpus = []  
  
# 2259 (reviews) rows to clean 
for i in range(0, 2259):  
      
    # column : "Review", row ith 
    review = re.sub('[^a-zA-Z]', ' ', dataset['Review'][i])  
      
    # convert all cases to lower cases 
    review = review.lower()  
      
    # split to array(default delimiter is " ") 
    review = review.split()  
      
    # creating PorterStemmer object to 
    # take main stem of each word 
    ps = PorterStemmer()  
      
    # loop for stemming each word 
    # in string array at ith row     
    review = [ps.stem(word) for word in review 
                if not word in set(stopwords.words('english'))] 
    
    #MEAN RATING FOR FOOD COMMENTS
    wanted = "buffet breakfast food supper"
    avgRating = 0
    cnt = collections.Counter()
    word = dataset['Review'][i]
    if word in wanted:
        cnt[word]+=1
        print(cnt)
        avgRating = avgRating + dataset['Rating'][i]
    #END RATING SCORE
                  
    # rejoin all string array elements 
    # to create back into a string 
    review = ' '.join(review)   
      
    # append each string to create 
    # array of clean text  
    corpus.append(review)  
The part of the code that I expect to calculate the average score is not giving me any output. That part is:
#MEAN RATING FOR FOOD COMMENTS
    wanted = "buffet breakfast food supper"
    avgRating = 0
    cnt = collections.Counter()
    word = dataset['Review'][i]
    if word in wanted:
        cnt[word]+=1
        print(cnt)
        avgRating = avgRating + dataset['Rating'][i]
 #END RATING SCORE
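
For reference, a likely reason the snippet above prints nothing: `word` holds the entire review text, so `word in wanted` asks whether the whole review is a substring of "buffet breakfast food supper", which is almost never true. In addition, `avgRating` and `cnt` are reset on every pass through the loop, so nothing would accumulate even if a match occurred. Below is a minimal sketch of one way to get the number, assuming the `Rating` and `Review` columns shown in the head of the data. The small DataFrame is a made-up stand-in for `output_rating.tsv`, and the names `pattern`, `mentions_food`, `food_mean`, `other_mean` and `n_food` are only illustrative.

```python
import pandas as pd

# Hypothetical stand-in for the real file; the column names mirror the post.
dataset = pd.DataFrame({
    "Rating": [30, 50, 50, 40],
    "Review": [
        "The breakfast buffet was cold.",
        "Clean rooms. Helpful staff.",
        "Great food and friendly service.",
        "Quiet location, comfortable bed.",
    ],
})

wanted = ["buffet", "breakfast", "food", "supper"]

# Whole-word, case-insensitive pattern: \b(?:buffet|breakfast|food|supper)\b
pattern = r"\b(?:" + "|".join(wanted) + r")\b"

# Boolean mask: True for every review that mentions at least one wanted word
mentions_food = dataset["Review"].str.contains(pattern, case=False, regex=True)

# Mean rating of the reviews that mention food, and of those that do not
food_mean = dataset.loc[mentions_food, "Rating"].mean()
other_mean = dataset.loc[~mentions_food, "Rating"].mean()
n_food = int(mentions_food.sum())

print(f"{n_food} food reviews, mean rating {food_mean}")
print(f"other reviews, mean rating {other_mean}")
```

The matching is done on the raw review text rather than on the stemmed corpus, so it needs no loop at all. Comparing the two means only hints at whether food mentions go along with different ratings; it does not by itself show that food quality drives the rating.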
#2
(Nov-15-2019, 01:34 PM)bongielondy Wrote: I would like to find out if the food quality affects the hotel rating.
Moved to the data science section, since this is more of a data science/algorithm question than Python itself. If you don't end up getting a reply here, you may want to seek a forum dedicated specifically to the kind of modeling you're doing.