Python Forum

Full Version: Predicting an output variable with sklearn
*Disclaimer: I am new to this forum so I apologize if I am posting this in a way that does not conform to the best practices. This is a relatively general question but I appreciate any help that is subsequently provided.*

I am new to Python and am not sure how to accomplish my objective.

I have a relatively small data set that I was given for research to fit a model that will accurately predict a continuous value based on two continuous inputs. It does not seem to be a complicated task, but I am unsure how to proceed.

I attempted to run a LogisticRegression model but found that it is better suited to categorical data. This is the head of my data set, to give you an idea of the type of data I am trying to process.

[Image: open?id=1HP4mWEBCpbgBXHSvP2IThj4T9DRWDbzW]
Thanks in advance for any advice / help going forward.

import pandas as pd
LSdata=pd.read_excel('/Users/connercross/Desktop/LSTimePredict.xlsx')
LSdata.head()
   Thickness  Length  L S Time
0       0.25      30       1.0
1       0.25      60       1.0
2       0.25      66       1.0
3       0.25      72       1.0
4       0.25      84       1.5
from sklearn.model_selection import train_test_split
x_vars=LSdata.drop('L S Time', axis=1)
y_var=LSdata['L S Time']
xTrain,xValid,yTrain,yValid=train_test_split(x_vars, y_var, train_size=.6, random_state=2)
/anaconda3/lib/python3.7/site-packages/sklearn/model_selection/_split.py:2179: FutureWarning: From version 0.21, test_size will always complement train_size unless both are specified.
  FutureWarning)
from sklearn.linear_model import LogisticRegression
logmod=LogisticRegression()
logmod.fit(xTrain, yTrain)
/anaconda3/lib/python3.7/site-packages/sklearn/linear_model/logistic.py:433: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.
  FutureWarning)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-12-2c6782c908b0> in <module>
----> 1 logmod.fit(xTrain, yTrain)

/anaconda3/lib/python3.7/site-packages/sklearn/linear_model/logistic.py in fit(self, X, y, sample_weight)
   1284         X, y = check_X_y(X, y, accept_sparse='csr', dtype=_dtype, order="C",
   1285                          accept_large_sparse=solver != 'liblinear')
-> 1286         check_classification_targets(y)
   1287         self.classes_ = np.unique(y)
   1288         n_samples, n_features = X.shape

/anaconda3/lib/python3.7/site-packages/sklearn/utils/multiclass.py in check_classification_targets(y)
    169     if y_type not in ['binary', 'multiclass', 'multiclass-multioutput',
    170                       'multilabel-indicator', 'multilabel-sequences']:
--> 171         raise ValueError("Unknown label type: %r" % y_type)
    172 
    173 

ValueError: Unknown label type: 'continuous'
Hi, could you upload the Excel file please?
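
In the meantime, the traceback already points at the likely cause: LogisticRegression is a classifier, so its fit() rejects a continuous target with "Unknown label type: 'continuous'". For a continuous output, a regressor such as LinearRegression is the usual starting point. Below is a minimal sketch of that swap. It assumes the five rows shown by LSdata.head() as stand-in data (the real Excel file is not available here), and the name linmod is just illustrative. Whether a plain linear fit is accurate enough for the full data set is something only the real data can answer.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Stand-in data: only the five rows shown by LSdata.head() in the post.
# Replace this with pd.read_excel(...) on the real file.
LSdata = pd.DataFrame({
    "Thickness": [0.25, 0.25, 0.25, 0.25, 0.25],
    "Length": [30, 60, 66, 72, 84],
    "L S Time": [1.0, 1.0, 1.0, 1.0, 1.5],
})

x_vars = LSdata.drop("L S Time", axis=1)
y_var = LSdata["L S Time"]

# Passing both sizes explicitly avoids the train_size/test_size FutureWarning.
xTrain, xValid, yTrain, yValid = train_test_split(
    x_vars, y_var, train_size=0.6, test_size=0.4, random_state=2
)

# LinearRegression is a regressor, so a continuous target is accepted.
linmod = LinearRegression()
linmod.fit(xTrain, yTrain)

# Predicted "L S Time" for the held-out rows.
predictions = linmod.predict(xValid)
print(predictions)
```

With only five rows the predictions themselves mean very little; the point is just that fit() no longer raises the ValueError once a regressor is used.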