Feb-20-2019, 01:44 PM
Dear all,
I am working on a pandas dataframe df and I would like to implement a K-Neighbors Classifier with sklearn. I have three variables that together form the input and one variable for the labels. I generated three arrays from df with numpy and concatenated them into a matrix:

    A = np.array(df.smoke_code)
    B = np.array(df.drug_code)
    C = np.array(df.drink_code)
    mat = np.column_stack((A, B, C))
    mat
    Out[77]:
    array([[0.25      , 0.        , 0.4       ],
           [0.        , 0.33333333, 0.6       ],
           [0.        , 0.        , 0.4       ],
           ...,
           [0.        , 0.33333333, 0.6       ],
           [0.        , 0.        , 0.        ],
           [0.5       , 1.        , 0.4       ]])

The matrix has 3 columns as expected. I then created and fed the classifier as follows:

    x = mat
    y = pd.factorize(new_df['label_code'].values)[0].reshape(-1, 1)  # label
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)
    model = KNeighborsClassifier(n_neighbors = 12)
    model.fit(x_train, y_train)
    prediction = model.predict_proba(x_test)
    print(accuracy_score(y_test, prediction))

    __main__:11: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
    Traceback (most recent call last):
      File "", line 13, in
        print(accuracy_score(y_test, prediction))
      File "/home/gigiux/.local/lib/python3.6/site-packages/sklearn/metrics/classification.py", line 176, in accuracy_score
        y_type, y_true, y_pred = _check_targets(y_true, y_pred)
      File "/home/gigiux/.local/lib/python3.6/site-packages/sklearn/metrics/classification.py", line 81, in _check_targets
        "and {1} targets".format(type_true, type_pred))
    ValueError: Classification metrics can't handle a mix of multiclass and continuous-multioutput targets

I don't understand how I am getting the last error. How can I solve this?
Thank you
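
A possible reading of the two messages, offered as a sketch rather than a confirmed diagnosis: predict_proba returns one probability per class for every sample (a 2-D float array), while accuracy_score compares class labels, so the two arguments end up with different target types. The warning points to the column-vector shape of y. The snippet below uses synthetic stand-in data (an assumption, since df and new_df are not shown) with the same shapes as mat and the factorized labels, flattens y with ravel(), and scores the output of predict instead of predict_proba.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (an assumption): 200 samples, 3 features in [0, 1],
# and 3 integer classes, mimicking the shapes of mat and the factorized labels.
rng = np.random.RandomState(42)
x = rng.rand(200, 3)
y = rng.randint(0, 3, size=200).reshape(-1, 1)  # column vector, as in the post

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=42
)

model = KNeighborsClassifier(n_neighbors=12)
# ravel() flattens (n_samples, 1) to (n_samples,), the shape the warning asks for.
model.fit(x_train, y_train.ravel())

# predict_proba returns class probabilities: shape (n_samples, n_classes).
probabilities = model.predict_proba(x_test)
# predict returns one class label per sample: shape (n_samples,).
labels = model.predict(x_test)

# accuracy_score compares true labels with predicted labels, not probabilities.
accuracy = accuracy_score(y_test.ravel(), labels)
print(probabilities.shape, labels.shape, accuracy)
```

The probabilities are still available from predict_proba if they are needed for something else, such as ranking samples by confidence; only the accuracy computation needs the hard labels.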