Python Forum

I'm new to machine learning and am learning scikit-learn by analyzing the iris dataset.

Here is my code:

import pandas as pd
from sklearn import svm, metrics
from sklearn.model_selection import train_test_split

csv = pd.read_csv('iris.csv')

csv_data = csv[["SepalLength", "SepalWidth", "PetalLength", "PetalWidth"]]
csv_label = csv["Name"]

train_data, test_data, train_label, test_label = train_test_split(csv_data, csv_label)

clf = svm.SVC(gamma = 'auto')
clf.fit(train_data, train_label)
pre = clf.predict(test_data)

But when I ran it, it said:

Traceback (most recent call last):
  File "iris-train2.py", line 9, in <module>
    csv_data = csv[["SepalLength", "SepalWidth", "PetalLength", "PetalWidth"]]
  File "C:\Users\KarinSugiura\Desktop\MachineLearning\learning\lib\site-packages\pandas\core\frame.py", line 2934, in __getitem__
    raise_missing=True)
  File "C:\Users\My name\lib\site-packages\pandas\core\indexing.py", line 1354, in _convert_to_indexer
    return self._get_listlike_indexer(obj, axis, **kwargs)[1]
  File "C:\Users\My name\lib\site-packages\pandas\core\indexing.py", line 1161, in _get_listlike_indexer
    raise_missing=raise_missing)
  File "C:\Users\My name\lib\site-packages\pandas\core\indexing.py", line 1246, in _validate_read_indexer
    key=key, axis=self.obj._get_axis_name(axis)))
KeyError: "None of [Index(['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth'], dtype='object')] are in the [columns]"

Can anyone help me solve this?

To solve this, we need to know the contents of the file "iris.csv". The KeyError says that none of the four requested names are among the columns, so either the file hasn't been loaded properly or its column names
differ from those you use in line #7.
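
One quick way to check is to print the column names pandas actually read and compare them with the ones you ask for. This is only a sketch: the CSV text below is made up for illustration (the real "iris.csv" may have different headers), and the names `sample`, `wanted` and `missing` are hypothetical.

```python
import io

import pandas as pd

# Made-up CSV content standing in for iris.csv; the real file may differ.
sample = io.StringIO(
    "sepal_length,sepal_width,petal_length,petal_width,species\n"
    "5.1,3.5,1.4,0.2,setosa\n"
    "7.0,3.2,4.7,1.4,versicolor\n"
)
df = pd.read_csv(sample)

# Names the script asks for, copied from the question.
wanted = ["SepalLength", "SepalWidth", "PetalLength", "PetalWidth"]

# Show what pandas actually found in the header row.
print(list(df.columns))

# Any requested name that is not a column would trigger the KeyError.
missing = [name for name in wanted if name not in df.columns]
print(missing)
```

If `missing` is not empty, use the names printed from `df.columns` in your selection instead.
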
scikit-learn accepts data frames, but you could also pass numpy arrays to its classifiers, e.g.

X = csv[[...column names go here...]].values # gets numpy array
y = csv["grouping variable name goes here"].values # gets numpy array

# You may want to encode the grouping variable y, e.g. using LabelEncoder from scikit-learn
# (classifiers such as SVC also accept string labels directly).
# You probably need to do column-wise scaling of your data,
# e.g. using StandardScaler or pandas facilities.

# Then do the splitting, training and testing with X, y and scikit-learn.
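
The steps above could look roughly like the following. It is a minimal sketch, not a fix for your file: it assumes the copy of the iris data bundled with scikit-learn (`load_iris`) instead of your "iris.csv", and `random_state=0` and `stratify=y` are my own choices to make the run repeatable.

```python
from sklearn import datasets, metrics, svm
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Assumption: use the iris data shipped with scikit-learn, not iris.csv.
iris = datasets.load_iris()
X = iris.data                            # numpy array, shape (150, 4)
names = iris.target_names[iris.target]   # string labels such as 'setosa'

# Encode the string labels as integers 0, 1, 2.
encoder = LabelEncoder()
y = encoder.fit_transform(names)

# Split before scaling so the test data does not influence the scaler.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=0, stratify=y
)

# Column-wise scaling, fitted on the training part only.
scaler = StandardScaler().fit(X_train)

clf = svm.SVC(gamma="auto")
clf.fit(scaler.transform(X_train), y_train)
pred = clf.predict(scaler.transform(X_test))

accuracy = metrics.accuracy_score(y_test, pred)
print(accuracy)

# Map the integer predictions back to the original names.
print(encoder.inverse_transform(pred[:5]))
```
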

Thanks!! I hadn't noticed that the column names were different. Now I get it.
Sorry, it was such an easy question... but thank you very much!

Karin

scidam

Karin