Python Forum

RandomForest --ValueError: setting an array element with a sequence - JaneTan - Sep-08-2021

I am completely new to RandomForest and Machine Learning. Some help will be appreciated! Thank you!

Example of the dataset:
ID  | sentiment | review                           | source
'5' | 0         | lousy movie                      | twitter
'6' | 1         | excellent acting                 | website
'7' | 0         | bad script, but wonderful actors | feedback
I create a bag-of-words (BOW) representation of the review column:

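import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.ensemble import RandomForestClassifier

# Load sheet 'Sheet1' from the workbook; row 0 holds the column names
file_location = 'C:/Desktop/test.xlsx'
xlsx = pd.ExcelFile(file_location, engine='openpyxl')
df = xlsx.parse('Sheet1', header=0)

bow = df['review']         # raw review text
Y_train = df['sentiment']  # target labels

# Convert the review text into a sparse matrix of token counts
vect = CountVectorizer()
bow = vect.fit_transform(bow)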
I created another df and added both the BOW matrix and source as columns.
Is this correct? Can I add the sparse BOW matrix to a df?
df1 = pd.DataFrame(bow)
df1['source']=df['source']

X_train=df1.values
print(X_train)
Output of print(X_train):
[[<1x16 sparse matrix of type '<class 'numpy.int64'>'
        with 6 stored elements in Compressed Sparse Row format>
  'twitter']
 [<1x16 sparse matrix of type '<class 'numpy.int64'>'
        with 5 stored elements in Compressed Sparse Row format>
  'website']
 [<1x16 sparse matrix of type '<class 'numpy.int64'>'
        with 2 stored elements in Compressed Sparse Row format>
  'feedback']
Train the RandomForest Model

forest = RandomForestClassifier(n_estimators = 100) 
forest = forest.fit( X_train, Y_train)
Error

ValueError: setting an array element with a sequence
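
One possible way around this error, given as a sketch rather than a definitive fix: the printed X_train above holds a whole sparse matrix object in each cell of the first column and a raw string in the second, so the classifier cannot turn it into a numeric array. Instead of going through a DataFrame, the text and the categorical column can each be encoded numerically and joined in one step with scikit-learn's ColumnTransformer. The toy DataFrame below is copied from the example table and stands in for the Excel sheet; the step names ('bow', 'src', 'features', 'forest') and random_state=0 are arbitrary choices made for illustration.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Toy data taken from the example table (assumption: the real data
# would be loaded from the Excel sheet as in the original post).
df = pd.DataFrame({
    'sentiment': [0, 1, 0],
    'review': ['lousy movie',
               'excellent acting',
               'bad script, but wonderful actors'],
    'source': ['twitter', 'website', 'feedback'],
})

# CountVectorizer expects a 1-D column of strings, so 'review' is passed
# as a plain string; OneHotEncoder expects 2-D input, so 'source' is
# passed as a one-element list.
preprocess = ColumnTransformer([
    ('bow', CountVectorizer(), 'review'),
    ('src', OneHotEncoder(handle_unknown='ignore'), ['source']),
])

model = Pipeline([
    ('features', preprocess),
    ('forest', RandomForestClassifier(n_estimators=100, random_state=0)),
])

# Fit on the raw columns; the pipeline builds the numeric matrix itself.
feature_columns = df[['review', 'source']]
model.fit(feature_columns, df['sentiment'])

# Inspect the purely numeric feature matrix the forest was trained on.
X_train = model.named_steps['features'].transform(feature_columns)
print(X_train.shape)

predictions = model.predict(feature_columns)
print(predictions)
```

Keeping the encoders inside the pipeline means the same vocabulary and source categories are reused when calling model.predict on new rows, and handle_unknown='ignore' lets a previously unseen source value pass through as all zeros instead of raising an error.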