Regression with pipeline and GridSearch

Regression with pipeline and GridSearch - patite - Jul-31-2020

Hello,
I am implementing a pipeline with GridSearchCV, using the Boston housing dataset.
Here is my code:

X, y = load_boston(return_X_y=True)
poly_params = {"degree": 2,
               "interaction_only": False,
               "include_bias": True
              }
# pre-instantiation
ridge_shrinkage = np.linspace(0.00001, 0.4, num=200)

df_metrics = pd.DataFrame(index=[0], columns=["Fold", "Shrinkage", "Metric", "Train", "Test"])

# main loop
f = 0
for (train, test) in rkf.split(X):
    f += 1
    print(f)
    # separate variables and folds
    x_train = X.values[train]
    x_test = X.values[test]
    
    y_train = y.values[train]
    y_test = y.values[test]
    
   
    # fit model
    model_ridge =  make_pipeline(StandardScaler(), PolynomialFeatures(**poly_params), Ridge()) # poly-params has been defined above on line 5
    model_lasso =  make_pipeline(StandardScaler(), PolynomialFeatures(**poly_params), Lasso())
    model_SVR =  make_pipeline(StandardScaler(), SVR())
    
## List of pipelines
pipelines = [model_ridge, model_lasso, model_SVR]
           
pipe_dict = {1: 'Ridge', 2: 'Lasso', 3: 'SVR'}

    # Apply the fit method to the pipelines
    for pipe in pipelines:                         # pipe can be replaced by any other word
        pipe.fit(X_train, y_train)
        pipe.predict(x_train) 
        pipe.predict(x_test)
       
    for i,model in enumerate(pipelines):
          print('Model score:{}'.format(pipe_dict[best_model]))
                               
    #I am not sure whether this specification would work.
    parameters = [ {'model-ridge__alpha': np.arange(0, 0.5, 0.01) },
                   {'model-lasso__alpha': np.arange(0, 0.5, 0.01) },
              {'model-SVR__'
            'C': [0.1, 1, 100, 1000],
            'epsilon': [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10],
            'gamma': [0.0001, 0.001, 0.005, 0.1, 1, 3, 5]
              }]
       scoring_func = make_scorer(mean_squared_error)
    
   # I would like to have the best model for each model in the pipelines                            
   grid_search = GridSearchCV(estimator = pipe, 
               param_grid = parameters,
               scoring = scoring_func,
               cv = 10,
               n_jobs = -1)
best_params = grid_result.best_params_
best_svr = SVR(kernel='rbf', C=best_params["C"], epsilon=best_params["epsilon"], gamma=best_params["gamma"],
                   coef0=0.1, shrinking=True,
                   tol=0.001, cache_size=200, verbose=False, max_iter=-1)
grid_search = grid_search.fit(X_train, y_train)

I don't know how to get the best model for each element of the pipelines. Thank you for your help!
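One possible way to get a best model per estimator is to run a separate GridSearchCV for each pipeline and keep each search's best_estimator_. The following is only a minimal sketch under stated assumptions, not a drop-in fix for the code above: it uses synthetic data from make_regression as a stand-in (load_boston was removed in scikit-learn 1.2), the step names "scale", "poly" and "model" and the helper make_poly_pipeline are made up for illustration, and the grids are shortened so it runs quickly. With make_pipeline the step names would instead be the lowercased class names, so the keys would look like ridge__alpha or svr__C. The sketch also uses the built-in "neg_mean_squared_error" scorer, because make_scorer(mean_squared_error) treats larger values as better by default and would therefore pick the worst model.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import SVR

# Assumption: synthetic data stands in for the housing dataset.
X, y = make_regression(n_samples=120, n_features=4, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

poly_params = {"degree": 2, "interaction_only": False, "include_bias": True}


def make_poly_pipeline(estimator):
    """Hypothetical helper: scale, expand polynomial features, then fit."""
    return Pipeline([
        ("scale", StandardScaler()),
        ("poly", PolynomialFeatures(**poly_params)),
        ("model", estimator),
    ])


# One (pipeline, grid) pair per model; grid keys are "<step name>__<parameter>".
alphas = np.linspace(0.01, 0.5, 5)  # alpha=0 is avoided, Lasso discourages it
searches = {
    "Ridge": (make_poly_pipeline(Ridge()), {"model__alpha": alphas}),
    "Lasso": (make_poly_pipeline(Lasso(max_iter=10000)), {"model__alpha": alphas}),
    "SVR": (
        Pipeline([("scale", StandardScaler()), ("model", SVR())]),
        {
            "model__C": [0.1, 1, 100],
            "model__epsilon": [0.01, 0.1, 1],
            "model__gamma": [0.001, 0.1, 1],
        },
    ),
}

best_models = {}
for name, (pipe, grid) in searches.items():
    search = GridSearchCV(
        pipe, param_grid=grid, scoring="neg_mean_squared_error", cv=5
    )
    search.fit(X_train, y_train)
    # best_estimator_ is the whole pipeline refit on X_train with the best grid point.
    best_models[name] = search.best_estimator_
    print(name, search.best_params_, "CV MSE:", -search.best_score_)
```

Each entry of best_models is a fitted pipeline, so best_models["SVR"].predict(X_test) scales the test data with the training statistics before predicting. There is then no need to rebuild an SVR by hand from best_params_.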
First error message:
File "<tokenize>", line 58
grid_search = GridSearchCV(estimator = pipe,
^
IndentationError: unindent does not match any outer indentation level
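This error is raised while the file is being read, before any scikit-learn code runs. In the listing above, the line grid_search = GridSearchCV(estimator = pipe, appears to start with three spaces, which matches neither column 0 nor the four-space level used inside the loop, so Python cannot tell which block it belongs to. A small sketch that reproduces the same message with the built-in compile function (the snippets and the helper name try_compile are made up for illustration):

```python
# A dedent to 3 spaces matches neither the 0-space nor the 4-space level.
bad_source = (
    "for fold in range(2):\n"
    "    x = fold\n"
    "   y = x\n"
)

# The same code with every line of the loop body at 4 spaces.
good_source = (
    "for fold in range(2):\n"
    "    x = fold\n"
    "    y = x\n"
)


def try_compile(source):
    """Return (True, '') if the source compiles, else (False, error message)."""
    try:
        compile(source, "<example>", "exec")
        return True, ""
    except IndentationError as exc:
        return False, exc.msg


bad_ok, bad_msg = try_compile(bad_source)
good_ok, good_msg = try_compile(good_source)
print(bad_ok, bad_msg)
print(good_ok, good_msg)
```

Re-indenting the whole script with one consistent width (four spaces is the usual convention) and keeping everything that belongs to the loop at the same level should clear this first error; other problems in the script, such as X_train versus x_train, would only show up afterwards.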