Python Forum

Bad input shape for SVC
Hi everyone,

I am trying to test a support vector machine classifier on the text data from a Kaggle kernel I found. The kernel trains a neural network on the data just fine, but I cannot get an SVC to work on it. The link to the kernel is below:

https://www.kaggle.com/yufengdev/bbc-tex...gorization

The code for my SVC is

from sklearn.svm import SVC
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score
k_fold = KFold(n_splits=5, shuffle=True, random_state=0)

clf = SVC()
scoring = 'accuracy'
score = cross_val_score(clf, x_train, y_train, cv=k_fold, n_jobs=1, scoring=scoring)
print(score)
The error I get is:

ValueError: bad input shape (1424, 5)

Does anyone know why I am getting this error and how I can resolve this problem?

Thanks
If the cited notebook is the basis for your train and test datasets,
note that it calls keras.utils.to_categorical(y_train, num_classes).
to_categorical is a one-hot encoder, so it turns y_train from a 1-D array
of integer class labels with shape (n_samples,) into a 2-D array with shape (n_samples, 5), one column per category (apparently 5 here).
SVC expects y to be 1-D, with one class label per sample (e.g. 0, 1, 2, 3, 4), which is why it rejects the shape (1424, 5).
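
To make the shape change concrete, here is a minimal sketch using NumPy only. The labels are made up for illustration, and np.eye(num_classes)[labels] is used as a stand-in for keras.utils.to_categorical so the snippet runs without Keras.

```python
import numpy as np

# Hypothetical integer labels for 6 samples across 5 classes.
labels = np.array([0, 3, 1, 4, 2, 3])
num_classes = 5

# Stand-in for keras.utils.to_categorical:
# one row per sample, one column per class.
one_hot = np.eye(num_classes)[labels]
print(one_hot.shape)    # (6, 5): 2-D, the kind of shape SVC rejects

# argmax along the class axis recovers the original 1-D labels.
recovered = np.argmax(one_hot, axis=1)
print(recovered.shape)  # (6,): 1-D, the kind of shape SVC expects
```

If the one-hot array has already been built and the original labels are gone, np.argmax(y_train, axis=1) is one way to get them back.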

So find and remove this line in your code:
y_train = keras.utils.to_categorical(y_train, num_classes)
After that, y_train keeps its 1-D shape and the SVC should fit without this error.
What matters is that y_train is a 1-D array with one class label per sample.
The labels do not have to be numbers: scikit-learn classifiers such as SVC also accept string labels.
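
For reference, a small self-contained sketch of the same cross-validation setup with 1-D labels. The data here is synthetic (make_classification), not the BBC text data from the notebook, and the one-hot array is built by hand only to show the conversion back to labels.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold, cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in for the notebook's data: 200 samples, 5 classes.
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=5,
    n_classes=5, n_clusters_per_class=1, random_state=0,
)

# Simulate one-hot encoded targets, as produced by to_categorical.
y_one_hot = np.eye(5)[y]

# Convert back to a 1-D array of class labels before handing it to SVC.
y_labels = np.argmax(y_one_hot, axis=1)

k_fold = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(SVC(), X, y_labels, cv=k_fold, scoring="accuracy")
print(scores)  # one accuracy value per fold
```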