Impact of words in a sentence on popularity
Hi, I have a table with a few thousand rows. One column holds a joke of one or a few sentences, and the second holds its number of views. I would like to rank the words that are used mostly in the best jokes but not in the bad ones, and vice versa. How can I do this in Python? Thank you.
Please show what you have tried.
Yes, code please :-D
Sure. I have to say that I'm stuck right at the start: so far I can only get whole sentences and their occurrences.
import pandas as pd
df = pd.read_csv('C:/Users/Adam/School/6th semester/project/dataset.csv')
df_test = df.query('text == "test sentence"')
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2827 entries, 0 to 2826
Data columns (total 2 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   File           2827 non-null  object 
 1   text           2827 non-null  object 
 3   views          2827 non-null  int64
I made some progress!

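from collections import Counter

oneList = list(df['text'])            # all jokes as a list of strings
oneString = ' '.join(oneList)         # one long string
allWords = oneString.lower().split()  # lowercase, split on whitespace
count = Counter(allWords)             # word -> number of occurrences
Now I'm able to get the frequency of words, but unfortunately without the influence of popularity :/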

Any ideas on how to distinguish between popular and unpopular jokes? Thanks!
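
One possible way to bring popularity into the word counts is to collect, for every word, the views of the jokes that contain it and then compare the averages. This is only a rough sketch: the function name mean_views_per_word and the min_jokes threshold are made up for illustration, and the toy lists stand in for df['text'] and df['views'].

```python
from collections import defaultdict
from statistics import mean

def mean_views_per_word(texts, views, min_jokes=2):
    """Map each word to the mean views of the jokes that contain it."""
    views_by_word = defaultdict(list)
    for text, n_views in zip(texts, views):
        # set() so a word repeated inside one joke is counted once
        for word in set(text.lower().split()):
            views_by_word[word].append(n_views)
    # Drop words seen in fewer than min_jokes jokes (assumed threshold),
    # because an average over a single joke says very little.
    return {
        word: mean(v)
        for word, v in views_by_word.items()
        if len(v) >= min_jokes
    }

# Toy data standing in for list(df['text']) and list(df['views'])
texts = ["the cat sat", "the dog ran", "a cat ran"]
views = [100, 10, 60]

scores = mean_views_per_word(texts, views)
# Words sorted from highest to lowest average views
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Words at the top of ranked tend to appear in well-viewed jokes, words at the bottom in poorly viewed ones. Very common words such as "the" will sit near the overall average, so it may help to filter them out first.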
I think you will need NLTK's frequency distribution functions or something similar.
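
A minimal sketch of that idea, using only collections.Counter (NLTK's FreqDist is built on Counter, and ConditionalFreqDist could hold the two groups instead). The assumptions here are mine: jokes are split into "popular" and "unpopular" at the median number of views, and words are scored by a smoothed log ratio of their relative frequencies in the two groups. The name word_log_ratios and the toy data are hypothetical.

```python
import math
from collections import Counter
from statistics import median

def word_log_ratios(texts, views):
    """Score words: > 0 leans popular, < 0 leans unpopular."""
    cutoff = median(views)  # assumed split point between the two groups
    popular, unpopular = Counter(), Counter()
    for text, n_views in zip(texts, views):
        target = popular if n_views > cutoff else unpopular
        target.update(text.lower().split())

    vocab = set(popular) | set(unpopular)
    # Add-one smoothing so a word missing from one group does not give log(0)
    pop_total = sum(popular.values()) + len(vocab)
    unpop_total = sum(unpopular.values()) + len(vocab)
    return {
        w: math.log((popular[w] + 1) / pop_total)
        - math.log((unpopular[w] + 1) / unpop_total)
        for w in vocab
    }

# Toy data standing in for list(df['text']) and list(df['views'])
texts = ["chicken crossed road", "chicken joke", "knock knock", "knock door"]
views = [900, 800, 20, 10]

ratios = word_log_ratios(texts, views)
# Most "popular-leaning" words first, most "unpopular-leaning" words last
ranked = sorted(ratios, key=ratios.get, reverse=True)
```

The start of ranked lists words typical of the popular jokes and the end lists words typical of the unpopular ones. With real data it is probably worth stripping punctuation and ignoring words that occur only a handful of times.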

