I am trying to run cross-validation (CV) on my training and testing datasets using `LinearRegression`. When I run the code, I get the error below, but when I run the same code with a Decision Tree I don't get any errors and it works. How can I fix this? Is my code for the CV section correct? Thank you for your help.
    X_normalized, y_for_normalized = (
        scaled_df[[
            "Part's Z-Height (mm)",
            "Part's Solid Volume (cm^3)",
            "Layer Height (mm)",
            "Printing/Scanning Speed (mm/s)",
            "Part's Orientation (Support's volume) (cm^3)",
        ]],
        scaled_df[[
            "Climate change (kg CO2 eq.)",
            "Climate change, incl biogenic carbon (kg CO2 eq.)",
            "Fine Particulate Matter Formation (kg PM2.5 eq.)",
            "Fossil depletion (kg oil eq.)",
            "Freshwater Consumption (m^3)",
            "Freshwater ecotoxicity (kg 1,4-DB eq.)",
            "Freshwater Eutrophication (kg P eq.)",
            "Human toxicity, cancer (kg 1,4-DB eq.)",
            "Human toxicity, non-cancer (kg 1,4-DB eq.)",
            "Ionizing Radiation (Bq. C-60 eq. to air)",
            "Land use (Annual crop eq. yr)",
            "Marine ecotoxicity (kg 1,4-DB eq.)",
            "Marine Eutrophication (kg N eq.)",
            "Metal depletion (kg Cu eq.)",
            "Photochemical Ozone Formation, Ecosystem (kg NOx eq.)",
            "Photochemical Ozone Formation, Human Health (kg NOx eq.)",
            "Stratospheric Ozone Depletion (kg CFC-11 eq.)",
            "Terrestrial Acidification (kg SO2 eq.)",
            "Terrestrial ecotoxicity (kg 1,4-DB eq.)",
        ]],
    )

| | Part's Z-Height (mm) | Part's Solid Volume (cm^3) | Layer Height (mm) | Printing/Scanning Speed (mm/s) | Part's Orientation (Support's volume) (cm^3) | Climate change (kg CO2 eq.) | Climate change, incl biogenic carbon (kg CO2 eq.) | Fine Particulate Matter Formation (kg PM2.5 eq.) | Fossil depletion (kg oil eq.) | Freshwater Consumption (m^3) | Freshwater ecotoxicity (kg 1,4-DB eq.) | Freshwater Eutrophication (kg P eq.) | Human toxicity, cancer (kg 1,4-DB eq.) | Human toxicity, non-cancer (kg 1,4-DB eq.) | Ionizing Radiation (Bq. C-60 eq. to air) | Land use (Annual crop eq. yr) | Marine ecotoxicity (kg 1,4-DB eq.) | Marine Eutrophication (kg N eq.) | Metal depletion (kg Cu eq.) | Photochemical Ozone Formation, Ecosystem (kg NOx eq.) | Photochemical Ozone Formation, Human Health (kg NOx eq.) | Stratospheric Ozone Depletion (kg CFC-11 eq.) | Terrestrial Acidification (kg SO2 eq.) | Terrestrial ecotoxicity (kg 1,4-DB eq.) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.258287 | 0.005030 | 0.0 | 0.666667 | 0.040088 | 0.069825 | 0.056976 | 0.083205 | 0.010373 | 0.113808 | 0.104798 | 0.086400 | 0.110358 | 0.012836 | 0.091120 | 0.108676 | 0.090401 | 0.087426 | 0.125608 | 0.079028 | 0.080495 | 0.078380 | 0.082404 | 0.045040 |
| 1 | 0.258287 | 0.005030 | 0.2 | 0.666667 | 0.036597 | 0.041682 | 0.022880 | 0.074884 | 0.004841 | 0.045640 | 0.102285 | 0.082884 | 0.044202 | 0.005414 | 0.086700 | 0.105749 | 0.087161 | 0.084130 | 0.060373 | 0.072878 | 0.073529 | 0.074829 | 0.075438 | 0.018122 |
| 2 | 0.258287 | 0.009557 | 0.4 | 0.666667 | 0.031013 | 0.033310 | 0.012113 | 0.073035 | 0.003458 | 0.023401 | 0.102914 | 0.082494 | 0.022690 | 0.003231 | 0.086279 | 0.105749 | 0.086937 | 0.084130 | 0.039708 | 0.071341 | 0.071981 | 0.074698 | 0.073447 | 0.009856 |
| 3 | 0.258287 | 0.009054 | 0.6 | 0.666667 | 0.031013 | 0.029213 | 0.006954 | 0.072111 | 0.002766 | 0.012936 | 0.102914 | 0.082103 | 0.012524 | 0.001921 | 0.086069 | 0.105423 | 0.086602 | 0.084130 | 0.029579 | 0.070572 | 0.071207 | 0.074435 | 0.072452 | 0.005723 |
| 4 | 0.258287 | 0.010060 | 1.0 | 0.666667 | 0.031711 | 0.025650 | 0.001795 | 0.071803 | 0.003458 | 0.002180 | 0.103542 | 0.082884 | 0.002063 | 0.001048 | 0.086490 | 0.106074 | 0.087049 | 0.084542 | 0.019449 | 0.070572 | 0.071207 | 0.074961 | 0.072452 | 0.001908 |
| 5 | 0.258287 | 0.005030 | 0.0 | 0.000000 | 0.040088 | 0.074279 | 0.062360 | 0.084129 | 0.011065 | 0.125000 | 0.104798 | 0.086790 | 0.121114 | 0.014146 | 0.091330 | 0.108676 | 0.091519 | 0.087426 | 0.136143 | 0.080566 | 0.081269 | 0.078511 | 0.083400 | 0.049385 |
| 6 | 0.258287 | 0.038226 | 0.0 | 0.666667 | 0.040088 | 0.097791 | 0.074249 | 0.109091 | 0.038036 | 0.135174 | 0.129299 | 0.111788 | 0.132164 | 0.024625 | 0.116582 | 0.133725 | 0.116102 | 0.112970 | 0.154781 | 0.105166 | 0.106037 | 0.104419 | 0.108280 | 0.064222 |
| 7 | 0.137212 | 0.004527 | 0.0 | 0.666667 | 0.030314 | 0.058247 | 0.046433 | 0.076117 | 0.003458 | 0.095349 | 0.099144 | 0.080150 | 0.092382 | 0.008907 | 0.084806 | 0.102821 | 0.084702 | 0.081246 | 0.106159 | 0.072878 | 0.073529 | 0.072199 | 0.075438 | 0.035608 |
| 8 | 0.137212 | 0.004527 | 0.2 | 0.666667 | 0.029616 | 0.035269 | 0.017721 | 0.069954 | 0.000000 | 0.037355 | 0.098516 | 0.078197 | 0.036246 | 0.002794 | 0.082281 | 0.101520 | 0.082803 | 0.080010 | 0.051053 | 0.068266 | 0.068885 | 0.070489 | 0.070462 | 0.013247 |
| 9 | 0.137212 | 0.010060 | 0.4 | 0.666667 | 0.028918 | 0.031706 | 0.010543 | 0.072111 | 0.002766 | 0.020494 | 0.102285 | 0.081712 | 0.019891 | 0.002358 | 0.085438 | 0.104773 | 0.086043 | 0.083306 | 0.036467 | 0.070572 | 0.071207 | 0.073908 | 0.072452 | 0.008372 |
| 10 | 0.137212 | 0.010060 | 0.6 | 0.666667 | 0.028220 | 0.027431 | 0.005384 | 0.070878 | 0.001383 | 0.010320 | 0.101657 | 0.080931 | 0.010019 | 0.001484 | 0.084806 | 0.104448 | 0.085373 | 0.082894 | 0.026742 | 0.069803 | 0.070433 | 0.073251 | 0.071457 | 0.004345 |
| 11 | 0.137212 | 0.009557 | 1.0 | 0.666667 | 0.027522 | 0.022800 | 0.000000 | 0.069029 | 0.000000 | 0.000000 | 0.101029 | 0.080150 | 0.000000 | 0.000000 | 0.083754 | 0.103472 | 0.084367 | 0.081658 | 0.016613 | 0.068266 | 0.068885 | 0.072330 | 0.070462 | 0.000000 |
| 12 | 0.137212 | 0.004527 | 0.0 | 0.000000 | 0.030314 | 0.062879 | 0.052266 | 0.077042 | 0.004149 | 0.107122 | 0.099144 | 0.080541 | 0.103875 | 0.010217 | 0.085227 | 0.102821 | 0.085037 | 0.081658 | 0.117099 | 0.073647 | 0.074303 | 0.072462 | 0.076433 | 0.040165 |
| 13 | 0.137212 | 0.037723 | 0.0 | 0.666667 | 0.030314 | 0.085857 | 0.063257 | 0.102003 | 0.031120 | 0.116134 | 0.123645 | 0.105929 | 0.112568 | 0.020695 | 0.110269 | 0.127544 | 0.110515 | 0.106790 | 0.134522 | 0.098247 | 0.099071 | 0.097843 | 0.101314 | 0.053624 |
| 14 | 0.077118 | 0.004527 | 0.0 | 0.666667 | 0.054050 | 0.080335 | 0.064827 | 0.091217 | 0.018672 | 0.126453 | 0.111709 | 0.093821 | 0.122145 | 0.016766 | 0.098485 | 0.115833 | 0.098223 | 0.094842 | 0.139789 | 0.087485 | 0.088235 | 0.085876 | 0.090366 | 0.052777 |
| 15 | 0.077118 | 0.004527 | 0.0 | 0.000000 | 0.054050 | 0.085144 | 0.070884 | 0.092450 | 0.019364 | 0.138081 | 0.111709 | 0.094211 | 0.133638 | 0.018075 | 0.099116 | 0.116158 | 0.098223 | 0.094842 | 0.151135 | 0.088253 | 0.089009 | 0.086139 | 0.091361 | 0.057864 |
| 16 | 0.077118 | 0.004527 | 0.0 | 0.333333 | 0.054050 | 0.082472 | 0.067519 | 0.091834 | 0.019364 | 0.132267 | 0.111709 | 0.094211 | 0.127744 | 0.017639 | 0.098695 | 0.116158 | 0.098223 | 0.094842 | 0.144652 | 0.087485 | 0.088235 | 0.086007 | 0.091361 | 0.054684 |
    from sklearn.linear_model import LinearRegression
    from sklearn.preprocessing import PolynomialFeatures

    lin_regressor = LinearRegression()

    # pass the order of your polynomial here
    poly = PolynomialFeatures(1)

    # convert to be used further to linear regression
    X_transform = poly.fit_transform(x_train)

    # fit this to Linear Regressor
    linear_regg = lin_regressor.fit(X_transform, y_train)
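As context for the `.iloc` call in the CV loop further down, here is a small check of what `poly.fit_transform` hands back when it is given a DataFrame. This is only a sketch on made-up data: `X_demo`, its columns `a` and `b`, and `rows` are placeholders, not part of my dataset, and it assumes scikit-learn's default output configuration.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import PolynomialFeatures

# Placeholder data, standing in for X_normalized (not the real values)
X_demo = pd.DataFrame({"a": [0.1, 0.2, 0.3, 0.4], "b": [1.0, 0.5, 0.25, 0.0]})
rows = np.array([0, 2])  # stand-in for the index array that KFold yields

poly = PolynomialFeatures(1)
X_poly = poly.fit_transform(X_demo)  # with default settings this is a NumPy array

print(type(X_poly).__name__)     # class name of the transformed object
print(hasattr(X_poly, "iloc"))   # .iloc exists on DataFrames only
print(X_poly[rows].shape)        # arrays are indexed with plain brackets
print(X_demo.iloc[rows].shape)   # the original DataFrame still supports .iloc
```

Note that with degree 1 the transformed array has one more column than the input (the bias column), so a model fitted on it would expect that extra column at scoring time as well.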
    import numpy as np
    from sklearn.metrics import SCORERS
    from sklearn.model_selection import KFold

    scorer = SCORERS['r2']
    cv = KFold(n_splits=5, random_state=0, shuffle=True)
    train_scores, test_scores = [], []
    for train, test in cv.split(X_normalized):
        X_transform2 = poly.fit_transform(X_normalized)
        OL = lin_regressor.fit(X_transform2.iloc[train], y_for_normalized.iloc[train])
        tr_21 = OL.score(X_train, y_train)
        ts_21 = OL.score(X_test, y_test)
        print("Train score:", tr_21)  # from documentation .score returns r^2
        print("Test score:", ts_21)   # from documentation .score returns r^2
        train_scores.append(tr_21)
        test_scores.append(ts_21)
    print("The Mean for Train scores is:", np.mean(train_scores))
    print("The Mean for Test scores is:", np.mean(test_scores))
    --------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    /var/folders/mm/r4gnnwl948zclfyx12w803040000gn/T/ipykernel_73165/2276765730.py in <module>
         10 for train, test in cv.split(X_normalized):
         11     X_transform2 = poly.fit_transform(X_normalized)
    ---> 12     OL=lin_regressor.fit(X_transform2.iloc[train], y_for_normalized.iloc[train])
         13     tr_21 = OL.score(X_train, y_train)
         14     ts_21 = OL.score(X_test, y_test)

    AttributeError: 'numpy.ndarray' object has no attribute 'iloc'

However, the same code works with Decision Trees!
    from sklearn.tree import DecisionTreeRegressor

    new_model = DecisionTreeRegressor(max_depth=9, min_samples_split=10, random_state=0)
    import numpy as np
    from sklearn.metrics import SCORERS
    from sklearn.model_selection import KFold

    scorer = SCORERS['r2']
    cv = KFold(n_splits=5, random_state=0, shuffle=True)
    train_scores, test_scores = [], []
    for train, test in cv.split(X_normalized):
        OO = new_model.fit(X_normalized.iloc[train], y_for_normalized.iloc[train])
        tr_2 = OO.score(X_train, y_train)
        ts_2 = OO.score(X_test, y_test)
        print("Train score:", tr_2)  # from documentation .score returns r^2
        print("Test score:", ts_2)   # from documentation .score returns r^2
        train_scores.append(tr_2)
        test_scores.append(ts_2)
    print("The Mean for Train scores is:", np.mean(train_scores))
    print("The Mean for Test scores is:", np.mean(test_scores))
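For comparison, here is a minimal sketch of a K-fold loop in which the polynomial step and the regressor are wrapped in one pipeline, and each fold is scored on its own training and held-out rows rather than on a fixed `X_train`/`X_test` split. The data is synthetic, and `X_demo`, `y_demo`, `weights`, `noise`, and `model` are placeholder names, not my real variables. It reflects an assumption about what the CV loop is meant to do, not a confirmed fix.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-ins for X_normalized / y_for_normalized (not the real data)
rng = np.random.default_rng(0)
X_demo = pd.DataFrame(rng.random((40, 5)), columns=[f"x{i}" for i in range(5)])
weights = rng.random((5, 3))
noise = 0.01 * rng.standard_normal((40, 3))
y_demo = pd.DataFrame(X_demo.to_numpy() @ weights + noise, columns=["y0", "y1", "y2"])

# The pipeline applies PolynomialFeatures inside fit() and score(),
# so the DataFrames can be indexed with .iloc before any transformation.
model = make_pipeline(PolynomialFeatures(1), LinearRegression())
cv = KFold(n_splits=5, shuffle=True, random_state=0)

train_scores, test_scores = [], []
for train, test in cv.split(X_demo):
    model.fit(X_demo.iloc[train], y_demo.iloc[train])
    # Score on this fold's own rows; .score returns R^2 for regressors
    train_scores.append(model.score(X_demo.iloc[train], y_demo.iloc[train]))
    test_scores.append(model.score(X_demo.iloc[test], y_demo.iloc[test]))

print("The Mean for Train scores is:", np.mean(train_scores))
print("The Mean for Test scores is:", np.mean(test_scores))
```

Because the transformer lives inside the pipeline, it is refit on each fold's training rows only, and the same loop body should work unchanged if `model` is swapped for a `DecisionTreeRegressor`.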