Regression with pipeline and GridSearch - patite - Jul-31-2020

Hello,

I am implementing a pipeline with GridSearch, using the Boston housing dataset. Here is my code:

    X, y = load_boston(return_X_y=True)

    poly_params = {"degree": 2,
                   "interaction_only": False,
                   "include_bias": True
                   }

    # pre-instantiation
    ridge_shrinkage = np.linspace(0.00001, 0.4, num=200)
    df_metrics = pd.DataFrame(index=[0], columns=["Fold", "Shrinkage", "Metric", "Train", "Test"])

    # main loop
    f = 0
    for (train, test) in rkf.split(X):
        f += 1
        print(f)

        # separate variables and folds
        x_train = X.values[train]
        x_test = X.values[test]
        y_train = y.values[train]
        y_test = y.values[test]

        # fit model
        model_ridge = make_pipeline(StandardScaler(), PolynomialFeatures(**poly_params), Ridge())  # poly-params has been defined above on line 5
        model_lasso = make_pipeline(StandardScaler(), PolynomialFeatures(**poly_params), Lasso())
        model_SVR = make_pipeline(StandardScaler(), SVR())

        ## List of pipelines
        pipelines = [model_ridge, model_lasso, model_SVR]
        pipe_dict = {1: 'Ridge', 2: 'Lasso', 3: 'SVR'}

        # Apply the fit method to the pipelines
        for pipe in pipelines:  # pipe can be replaced by any other word
            pipe.fit(X_train, y_train)
            pipe.predict(x_train)
            pipe.predict(x_test)

        for i,model in enumerate(pipelines):
            print('Model score:{}'.format(pipe_dict[best_model]))

    #I am not sure whether this specification would work.
    parameters = [
        {'model-ridge__alpha': np.arange(0, 0.5, 0.01)
         },
        {'model-lasso__alpha': np.arange(0, 0.5, 0.01)
         },
        {'model-SVR__'
         'C': [0.1, 1, 100, 1000],
         'epsilon': [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10],
         'gamma': [0.0001, 0.001, 0.005, 0.1, 1, 3, 5]
         }]

    scoring_func = make_scorer(mean_squared_error)

    # I would like to have the best model for each model in the pipelines
    grid_search = GridSearchCV(estimator = pipe,
                               param_grid = parameters,
                               scoring = scoring_func,
                               cv = 10,
                               n_jobs = -1)

    best_params = grid_result.best_params_
    best_svr = SVR(kernel='rbf', C=best_params["C"], epsilon=best_params["epsilon"], gamma=best_params["gamma"],
                   coef0=0.1, shrinking=True, tol=0.001, cache_size=200, verbose=False, max_iter=-1)

    grid_search = grid_search.fit(X_train, y_train)

I don't know how to get the best model for each element of the pipelines.

Thank you for your help!

First error message:

      File "<tokenize>", line 58
        grid_search = GridSearchCV(estimator = pipe,
        ^
    IndentationError: unindent does not match any outer indentation level
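One possible way to get the best model for each pipeline is to run a separate GridSearchCV per pipeline and keep each search's `best_estimator_`. The sketch below is not the poster's code but a minimal illustration under several assumptions: it uses synthetic data from `make_regression` in place of the Boston dataset (which recent scikit-learn releases no longer ship), a single train/test split instead of the repeated K-fold loop, and much smaller grids so that it runs quickly. Two details matter for the parameter grids. `make_pipeline` names each step after its lowercased class name, so the keys are `ridge__alpha`, `lasso__alpha`, `svr__C` and so on rather than `model-ridge__alpha`. And the built-in `"neg_mean_squared_error"` scorer is used because `make_scorer(mean_squared_error)` with default settings would treat a larger error as better. The IndentationError itself is a plain Python issue: every line in a block has to line up with an enclosing indentation level, and tabs and spaces should not be mixed.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (assumption): replaces load_boston for this sketch.
X, y = make_regression(n_samples=150, n_features=4, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

poly_params = {"degree": 2, "interaction_only": False, "include_bias": True}

# Each entry pairs a pipeline with its own grid. The step prefixes
# ("ridge__", "lasso__", "svr__") come from make_pipeline's automatic naming.
# The grids are deliberately tiny (assumption) to keep the run short.
searches = {
    "Ridge": (
        make_pipeline(StandardScaler(), PolynomialFeatures(**poly_params), Ridge()),
        {"ridge__alpha": [0.01, 0.1, 0.5]},
    ),
    "Lasso": (
        make_pipeline(
            StandardScaler(), PolynomialFeatures(**poly_params), Lasso(max_iter=10000)
        ),
        {"lasso__alpha": [0.01, 0.1, 0.5]},
    ),
    "SVR": (
        make_pipeline(StandardScaler(), SVR()),
        {"svr__C": [1, 100], "svr__epsilon": [0.01, 0.1], "svr__gamma": [0.01, 0.1]},
    ),
}

best_models = {}
for name, (pipe, grid) in searches.items():
    # One grid search per pipeline; scores are negated MSE, so higher is better.
    search = GridSearchCV(pipe, grid, scoring="neg_mean_squared_error", cv=5)
    search.fit(X_train, y_train)
    best_models[name] = search.best_estimator_  # refit on the full training split
    print(name, search.best_params_, "CV MSE:", -search.best_score_)
```

Each value in `best_models` is a fitted pipeline, so something like `best_models["SVR"].predict(X_test)` applies the scaling and the tuned regressor in one call. The same loop could be placed inside the repeated K-fold loop if per-fold tuning is wanted.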