Mar-24-2020, 11:34 AM
Hi
I have the code below, which loops through every combination of columns in a DataFrame, fits a regression model on each subset, and returns the best one. The code does not raise any errors until I run the last line, best_subset(X, Y), which returns the following error: "IndexingError: Too many indexers".
Does anyone have an idea why it is not working?
import numpy as np
import pandas as pd
import urllib
from itertools import chain, combinations
import statsmodels.api as sm

#Data
Rawdata = pd.read_csv("C:\\Users\\Yell\Documents\\datafilePython.csv")

#Regression code
def best_subset(X, Y):
    n_features = X.shape[1]
    subsets = chain.from_iterable(combinations(range(n_features), k+1) for k in range(n_features))
    best_score = -np.inf
    best_subset = None
    for subset in subsets:
        lin_reg = sm.OLS(Y, X.iloc[:, subset]).fit()
        score = lin_reg.rsquared_adj
        if score > best_score:
            best_score, best_subset = score, subset
    return best_subset, best_score

#Define variables
X = Rawdata.iloc[:, 1:10]
y = Rawdata.iloc[:, 0]

#Run
best_subset(X, Y)

Thanks,
Bitten by Python
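
A possible cause, offered as a sketch rather than a confirmed diagnosis: itertools.combinations yields tuples, and pandas .iloc tends to read a tuple as one indexer per axis, so X.iloc[:, subset] with a tuple in the column position may be what triggers "Too many indexers". Converting each subset to a list before indexing usually avoids this. The helper name all_subsets below is hypothetical and not part of the original code. It only reproduces the subset enumeration with the standard library, so it runs without pandas or statsmodels.

```python
from itertools import chain, combinations


def all_subsets(n_features):
    """Return every non-empty subset of column positions as a list of lists.

    Hypothetical helper: same enumeration as in best_subset, but each
    combination (a tuple) is converted to a list.
    """
    tuples = chain.from_iterable(
        combinations(range(n_features), k + 1) for k in range(n_features)
    )
    return [list(subset) for subset in tuples]


if __name__ == "__main__":
    for subset in all_subsets(3):
        # Inside best_subset, the assumed fix would be:
        #     X.iloc[:, list(subset)]
        print(subset)
```

Separately, the variable is defined as lowercase y but passed as uppercase Y, so it may be worth checking that both names refer to the same column.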