Python Forum
Python Project - Parkinson's Detection
#1
Hi, I am trying to build a machine learning model for the Parkinson's dataset. I am having trouble extracting the features from the dataset, and I need help extracting the right features and labels.

import numpy as np
import pandas as pd
import os, sys
from sklearn.preprocessing import StandardScaler
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

#read parkinsons data
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/parkinsons/parkinsons.data'
data=pd.read_csv(url)
print(data.head())

#extract features and labels
features=data.loc[:,data.columns!='status'].values
labels=data.loc[:,'status'].values

#Scale the features 
scaler=StandardScaler()
features=scaler.fit_transform(features)

print(features.shape)
#Splitting the dataset
x_train,x_test,y_train,y_test=train_test_split(features, labels, test_size=0.2, random_state=7)

model=XGBClassifier()
model.fit(x_train,y_train)

y_pred=model.predict(x_test)
print(accuracy_score(y_test, y_pred)*100)
I am getting this error:

Error:
Traceback (most recent call last):
  File "D:\practise\parkinsons detection\detect_parkinson.py", line 20, in <module>
    features=scaler.fit_transform(features)
  File "C:\Users\Asus4\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\base.py", line 553, in fit_transform
    return self.fit(X, **fit_params).transform(X)
  File "C:\Users\Asus4\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\preprocessing\data.py", line 639, in fit
    return self.partial_fit(X, y)
  File "C:\Users\Asus4\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\preprocessing\data.py", line 663, in partial_fit
    force_all_finite='allow-nan')
  File "C:\Users\Asus4\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\utils\validation.py", line 496, in check_array
    array = np.asarray(array, dtype=dtype, order=order)
  File "C:\Users\Asus4\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\core\numeric.py", line 538, in asarray
    return array(a, dtype, copy=False, order=order)
ValueError: could not convert string to float: 'phon_R01_S01_1'
#2
Judging by the error message, features contains some non-numeric text, which raises the exception at line 20. I suggest printing features on the line before that to see what it contains.
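A quicker check than printing the whole array is to look at the column dtypes. Here is a minimal sketch of that idea. It does not download the real file: the small DataFrame is a made-up stand-in, and the column names (name, MDVP:Fo(Hz), MDVP:Jitter(%)) and all values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for parkinsons.data: one string ID column, two numeric
# feature columns, and the 'status' label. All values are invented.
data = pd.DataFrame({
    "name": ["phon_R01_S01_1", "phon_R01_S01_2", "phon_R01_S02_1"],
    "MDVP:Fo(Hz)": [120.0, 122.5, 116.7],
    "MDVP:Jitter(%)": [0.008, 0.010, 0.011],
    "status": [1, 1, 0],
})

# Same selection as in the question: every column except 'status'.
feature_frame = data.loc[:, data.columns != "status"]

# Columns that are not numeric are the ones StandardScaler cannot convert.
non_numeric = feature_frame.select_dtypes(exclude="number").columns.tolist()

print(feature_frame.dtypes)
print("Non-numeric columns:", non_numeric)
```

Any column listed as non-numeric has to be dropped or encoded before scaling.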
#3
You are getting this error because your dataset contains a name column, which holds strings.
The name is not a useful feature for making predictions, so the first column needs to be excluded from the features.

Use this instead:
features=data.loc[:,data.columns!='status'].values[:,1:]
The [:,1:] slice keeps all rows and every column from index 1 onward, which drops the name column at index 0.
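The positional slice above relies on the name column being first. As an alternative, here is a minimal sketch that drops the non-feature columns by name instead. The DataFrame is again a made-up stand-in (column names and values are assumptions, not the real file), and by_position / by_name are illustrative variable names.

```python
import pandas as pd

# Hypothetical stand-in for parkinsons.data; all values are invented.
data = pd.DataFrame({
    "name": ["phon_R01_S01_1", "phon_R01_S01_2", "phon_R01_S02_1"],
    "MDVP:Fo(Hz)": [120.0, 122.5, 116.7],
    "MDVP:Jitter(%)": [0.008, 0.010, 0.011],
    "status": [1, 1, 0],
})

# Positional version from this reply: drop 'status', then skip column 0.
by_position = data.loc[:, data.columns != "status"].values[:, 1:]

# Name-based version: drop both non-feature columns explicitly.
by_name = data.drop(columns=["name", "status"]).values

labels = data["status"].values

print(by_position.dtype, by_name.dtype)
print(by_name.shape, labels.shape)
```

Both versions select the same numbers. The name-based one keeps working if the column order changes, and because the string column is removed before .values is called, the result is a float array rather than an object array.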

With this change, the accuracy of the model is 94.87%.
During my research, I found a very similar Python project that is worth going through: Python Project - Detecting Parkinson's Disease

Corrected code:
import numpy as np
import pandas as pd
import os, sys
from sklearn.preprocessing import StandardScaler
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

#read parkinsons data
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/parkinsons/parkinsons.data'
data=pd.read_csv(url)
print(data.head())

#extract features and labels
features=data.loc[:,data.columns!='status'].values[:,1:]
labels=data.loc[:,'status'].values

#Scale the features 
scaler=StandardScaler()
features=scaler.fit_transform(features)

print(features.shape)
#Splitting the dataset
x_train,x_test,y_train,y_test=train_test_split(features, labels, test_size=0.2, random_state=7)

model=XGBClassifier()
model.fit(x_train,y_train)

y_pred=model.predict(x_test)
print(accuracy_score(y_test, y_pred)*100)