Python Forum
Bag of words question
#2
from sklearn.feature_extraction.text import CountVectorizer

sentences = ['dont iterate over rows of dataframe',
             'try to use dataframe indexing']

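# Learn the vocabulary and build one count vector per sentence
vec = CountVectorizer()
vectors = vec.fit_transform(sentences).toarray()

# vocabulary_ maps word -> column index; print it as sorted (index, word) pairs
print(sorted((v, k) for k, v in vec.vocabulary_.items()))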
print(vectors[0])
print(vectors[1])
Output:
[(0, 'dataframe'), (1, 'dont'), (2, 'indexing'), (3, 'iterate'), (4, 'of'), (5, 'over'), (6, 'rows'), (7, 'to'), (8, 'try'), (9, 'use')]
[1 1 0 1 1 1 1 0 0 0]
[1 0 1 0 0 0 0 1 1 1]
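
If you want to see roughly what CountVectorizer is doing there, here is a minimal standard-library sketch. It assumes the default behaviour (lowercase everything, keep tokens of two or more word characters, order the vocabulary alphabetically). The function name bag_of_words is made up for illustration and is not part of scikit-learn.

```python
import re
from collections import Counter


def bag_of_words(sentences):
    """Return (vocabulary, vectors) for a list of strings.

    Hypothetical helper that approximates CountVectorizer's defaults;
    it is a sketch, not the library's implementation.
    """
    # Lowercase and keep runs of 2+ word characters as tokens.
    tokenized = [re.findall(r"\b\w\w+\b", s.lower()) for s in sentences]

    # Vocabulary: every distinct token, sorted, mapped to a column index.
    unique_tokens = sorted({tok for toks in tokenized for tok in toks})
    vocabulary = {tok: i for i, tok in enumerate(unique_tokens)}

    # One count vector per sentence, columns in vocabulary order.
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)  # missing tokens count as 0
        vectors.append([counts[tok] for tok in vocabulary])
    return vocabulary, vectors


sentences = ['dont iterate over rows of dataframe',
             'try to use dataframe indexing']

vocabulary, vectors = bag_of_words(sentences)
print(sorted((i, tok) for tok, i in vocabulary.items()))
print(vectors[0])
print(vectors[1])
```

For these two sentences the counts should match the vectors above. The sketch returns plain lists rather than a sparse matrix, so it is only meant for small inputs.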
Messages In This Thread
Bag of words question - by fancy_panther - Mar-23-2017, 10:32 AM
RE: Bag of words question - by zivoni - Mar-23-2017, 06:26 PM
