Nov-06-2019, 01:33 AM
Thank you. I have updated the code to:
for i in range(0, 1000):
    # column: "Review", row i
    try:
        review = re.sub('[^a-zA-Z]', ' ', dataset['Review'][i])
        # convert all cases to lower case
        review = review.lower()
        # split to array (default delimiter is " ")
        review = review.split()
        # create a PorterStemmer object to take the main stem of each word
        ps = PorterStemmer()
        # loop for stemming each word in the string array at the ith row
        review = [ps.stem(word) for word in review
                  if word not in set(stopwords.words('english'))]
        # rejoin all string array elements to create back into a string
        review = ' '.join(review)
        # append each string to create the array of clean text
        corpus.append(review)
    except KeyError as e:
        print(ps.stem(review))

I seem to get numerous lines of the same review. I will look at the source file again and give feedback. The output is:
wife visit johannesburg famili function stay locat citi never felt comfort secur stay african pride melros arch mani restaur importantli servic attent staff african pride provid except cannot rave enough stay accommod would use citi world class
wife visit johannesburg famili function stay locat citi never felt comfort secur stay african pride melros arch mani restaur importantli servic attent staff african pride provid except cannot rave enough stay accommod would use citi world class
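One possible cause of the repeated lines, not yet confirmed against the source file: when `dataset['Review'][i]` raises `KeyError`, `review` still holds the cleaned string from the previous row, so the `except` branch prints that earlier review again for every missing index. Below is a minimal sketch of a loop that iterates over the values instead of looking them up by label, and records skipped rows rather than printing. The names `clean_review`, `build_corpus`, `STOP_WORDS` and the sample data are hypothetical. The default `stem` is a do-nothing placeholder, and the stop-word set is a stand-in for `set(stopwords.words('english'))`.

```python
import re

# Stand-in stop-word list for illustration only; the original code uses
# set(stopwords.words('english')) from NLTK instead.
STOP_WORDS = {"the", "a", "and", "was", "we", "at", "in"}


def clean_review(text, stem=lambda w: w, stop_words=STOP_WORDS):
    """Keep letters only, lowercase, drop stop words, stem each word."""
    words = re.sub('[^a-zA-Z]', ' ', text).lower().split()
    return ' '.join(stem(w) for w in words if w not in stop_words)


def build_corpus(reviews, stem=lambda w: w, stop_words=STOP_WORDS):
    """Clean every review once, skipping missing or non-string entries."""
    corpus = []
    skipped = []
    for position, text in enumerate(reviews):
        if not isinstance(text, str):
            # Record the position instead of re-printing the previous review.
            skipped.append(position)
            continue
        corpus.append(clean_review(text, stem, stop_words))
    return corpus, skipped


# Hypothetical sample data; None marks a missing review.
sample = ["We stayed at the African Pride!", None,
          "Attentive staff, great location."]
corpus, skipped = build_corpus(sample)
print(corpus)
print(skipped)
```

With the real data this could be called as `build_corpus(dataset['Review'].head(1000), stem=PorterStemmer().stem, stop_words=set(stopwords.words('english')))`, assuming `dataset` is a pandas DataFrame. Building the stemmer and the stop-word set once, outside the loop, also avoids recreating them for every word.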