Evaluating the Performance of Machine Learning Algorithms

FelixLarry · Sep-02-2022, 09:20 PM

Hello comrades. As someone who is very new to Python, I keep learning and trying new things each day. Here is some code I have put together in an attempt to evaluate the performance of a machine learning algorithm using resampling. Please spare a few minutes of your time to review it and help me improve it. Many thanks in advance.

# Evaluate using a train and a test set: Method 1
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
filename = 'pima-indians-diabetes.data.csv'
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
dataframe = pd.read_csv(filename, names=names)
array = dataframe.values
# Separate data into X and Y components
X = array[:,0:8]
Y = array[:,8]
# Split the data into train and test sets (33% held out for testing)
test_size = 0.33
seed = 7
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=test_size, random_state=seed)
# Fit on the training set, then score (mean accuracy) on the held-out test set
model = LogisticRegression(solver='newton-cg')
model.fit(X_train, Y_train)
result = model.score(X_test, Y_test)
print('Accuracy: %.3f%%' % (result * 100.0))
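
For anyone wondering what the split in Method 1 does conceptually, here is a minimal pure-Python sketch: shuffle the row indices with a fixed seed, then cut them into a test part and a train part. The function name holdout_split is hypothetical, the row count of 768 is an assumption about the Pima CSV, and rounding the test count up is my understanding of how scikit-learn treats a fractional test_size. It will not reproduce the exact rows that train_test_split picks, because the shuffling differs.

```python
import math
import random


def holdout_split(n_samples, test_size=0.33, seed=7):
    """Return (train_indices, test_indices) for a single shuffled split.

    Hypothetical helper for illustration only; it mimics the idea behind
    a train/test split, not scikit-learn's exact shuffling.
    """
    indices = list(range(n_samples))
    # A seeded generator makes the shuffle repeatable from run to run
    random.Random(seed).shuffle(indices)
    # Round the test count up (assumed to match scikit-learn's behaviour)
    n_test = math.ceil(n_samples * test_size)
    return indices[n_test:], indices[:n_test]


# 768 rows is an assumption about the size of the Pima dataset
train_idx, test_idx = holdout_split(768, test_size=0.33, seed=7)
print(len(train_idx), len(test_idx))
```

The accuracy from one split depends on which rows happen to land in the test set, so a different seed can give a noticeably different number.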

# Evaluate using K-fold Cross Validation: Method 2
import pandas as pd
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
filename = 'pima-indians-diabetes.data.csv'
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
dataframe = pd.read_csv(filename, names=names)
array = dataframe.values
# Separate the data into X and Y components
X = array[:,0:8]
y = array[:,8]
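# Set up 10-fold cross-validation with shuffling
num_folds = 10
seed = 7
kfold = KFold(n_splits=num_folds, shuffle=True, random_state=seed)
model = LogisticRegression(solver='newton-cg')
# cross_val_score fits a fresh copy of the model on each fold and returns one accuracy per fold
results = cross_val_score(model, X, y, cv=kfold)
# Report the mean accuracy and its standard deviation across the folds
print('Accuracy: %.3f%% (%.3f%%)' % (results.mean() * 100.0, results.std() * 100.0))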
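
To show roughly what the K-fold setup in Method 2 does, here is a small pure-Python sketch. It shuffles the row indices once, cuts them into num_folds blocks, and uses each block as the test set exactly once. The names kfold_indices and mean_and_std are hypothetical, and 768 rows is again an assumption about the CSV. The fold sizing (the first n_samples % n_splits folds get one extra row) follows my reading of the KFold documentation; the actual row assignment will differ from scikit-learn's.

```python
import random


def kfold_indices(n_samples, n_splits=10, seed=7):
    """Yield (train_indices, test_indices) once per fold.

    Hypothetical helper for illustration only.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder over the first folds so sizes differ by at most 1
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def mean_and_std(scores):
    """Mean and population standard deviation (NumPy's default for .std())."""
    mean = sum(scores) / len(scores)
    variance = sum((s - mean) ** 2 for s in scores) / len(scores)
    return mean, variance ** 0.5


# 768 rows is an assumption about the size of the Pima dataset
fold_sizes = [len(test) for _, test in kfold_indices(768, n_splits=10, seed=7)]
print(fold_sizes)
```

Every row is tested exactly once and trained on in the other nine folds, which is why the mean over folds is usually a steadier estimate than a single train/test split.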
