Hi, I have a table with a few thousand rows. One column contains a joke of one or a few sentences, and a second column contains the number of views. I would like to find the words that are used mostly in the best jokes but not in the bad ones, and vice versa. How to do it in Python? Thank you.
Quote: How to do it in Python?
Please show what you have tried.
Sure. I have to say that I'm stuck right at the start: for now I'm only able to get whole sentences and their occurrences.
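import pandas as pd
df = pd.read_csv('C:/Users/Adam/School/6th semester/project/dataset.csv')
# Count how many times each full joke text occurs
df['text'].value_counts()
# Select the rows whose text matches one exact sentence
df_test = df.query('text == "test sentence"')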
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2827 entries, 0 to 2826
Data columns (total 2 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 File 2827 non-null object
1 text 2827 non-null object
3 views 2827 non-null int64
I made some progress!
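from collections import Counter
# Join every joke into one string, lowercase it, and split on whitespace
oneList = list(df['text'])
oneString = ' '.join(oneList)
allWords = oneString.lower().split()
# Count how often each word occurs across all jokes
count = Counter(allWords)
print(count)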
Now I'm able to get the frequency of words, but unfortunately without taking popularity into account :/
Any ideas on how to distinguish between popular and unpopular jokes? Thanks!
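One possible way to bring popularity in, as a rough sketch rather than a definitive method: split the jokes into a "popular" and an "unpopular" half at the median of `views`, count words separately in each half, and rank words by the log of the ratio of their smoothed relative frequencies. The function name `word_popularity_scores`, the `min_count` parameter, the median split, the add-one smoothing, and the simple regex tokenizer are all assumptions made for illustration; only the `text` and `views` column names come from the thread.

```python
import math
import re
from collections import Counter

import pandas as pd


def word_popularity_scores(df, text_col="text", views_col="views", min_count=2):
    """Rank words by how much more often they occur in high-view jokes.

    Hypothetical helper: the median split and add-one smoothing are
    illustrative choices, not the only reasonable ones.
    """
    median = df[views_col].median()
    popular = df[df[views_col] > median]
    unpopular = df[df[views_col] <= median]

    def count_words(texts):
        # Lowercase each text and keep runs of letters/apostrophes as words
        counter = Counter()
        for text in texts:
            counter.update(re.findall(r"[a-z']+", str(text).lower()))
        return counter

    pop_counts = count_words(popular[text_col])
    unpop_counts = count_words(unpopular[text_col])
    pop_total = sum(pop_counts.values())
    unpop_total = sum(unpop_counts.values())
    vocab = set(pop_counts) | set(unpop_counts)
    vocab_size = len(vocab)

    rows = []
    for word in vocab:
        p, u = pop_counts[word], unpop_counts[word]
        if p + u < min_count:
            continue  # skip very rare words, whose ratios are mostly noise
        # Add-one smoothing keeps the ratio finite when a count is zero
        p_rate = (p + 1) / (pop_total + vocab_size)
        u_rate = (u + 1) / (unpop_total + vocab_size)
        rows.append((word, p, u, math.log(p_rate / u_rate)))

    result = pd.DataFrame(
        rows, columns=["word", "popular_count", "unpopular_count", "log_ratio"]
    )
    return result.sort_values("log_ratio", ascending=False).reset_index(drop=True)


# Tiny made-up example, only to show the call
demo = pd.DataFrame({
    "text": ["cat cat dog", "cat bird", "fish dog", "fish fish bird"],
    "views": [100, 90, 5, 3],
})
print(word_popularity_scores(demo, min_count=1))
```

Words at the top of the result lean towards popular jokes, words at the bottom towards unpopular ones, and a `log_ratio` near zero means the word is about equally common in both groups. On the real data you would call `word_popularity_scores(df)` and look at `.head(20)` and `.tail(20)`; filtering out common stop words first would probably make the lists more meaningful.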