Jan-31-2020, 04:46 PM
Hello Jef,
Thanks again for your input. Ok, I have made some changes to my code:
from sklearn.ensemble import ExtraTreesClassifier

model = ExtraTreesClassifier()
model.fit(x_train, y_train)
coef = pd.DataFrame({'Columns': x_train.columns, 'Importances': np.transpose(model.feature_importances_)}).sort_values(by=['Importances'], ascending=False)
print(coef.nlargest(10, 'Importances'))

I am getting the following output:
Output:
                                Columns  Importances
125      incident_severity_Minor Damage     0.042847
 40               insured_hobbies_chess     0.041505
126        incident_severity_Total Loss     0.028544
124              collision_type_Unknown     0.019634
 41           insured_hobbies_cross-fit     0.014173
  1                     policy_state_OH     0.009765
 16                    insured_sex_MALE     0.009697
 57      insured_relationship_own-child     0.009582
 25  insured_occupation_exec-managerial     0.009513
  5               policy_deductable_500     0.009146
I can't make sense of this, as the percentages don't seem right. Do they need to be calibrated or converted?
Thank you!
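For reference, here is a minimal sketch of how these values can be sanity-checked. It assumes synthetic data from `make_classification` as a stand-in for `x_train` / `y_train` (the real insurance data is not reproduced here), and the `feature_N` column names are hypothetical. In scikit-learn's tree ensembles, `feature_importances_` is normalized so the values sum to 1 across all columns, so with well over a hundred one-hot columns each individual share is expected to be small.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier

# Synthetic stand-in for x_train / y_train (hypothetical data, 130 columns).
X, y_train = make_classification(
    n_samples=500, n_features=130, n_informative=10, random_state=0
)
x_train = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(X.shape[1])])

model = ExtraTreesClassifier(n_estimators=100, random_state=0)
model.fit(x_train, y_train)

coef = pd.DataFrame(
    {"Columns": x_train.columns, "Importances": model.feature_importances_}
)

# The importances are relative shares: they add up to 1 over all columns.
total = coef["Importances"].sum()
# Share each column would get if all columns were equally important.
uniform_share = 1.0 / len(coef)

print(f"sum of importances: {total:.6f}")
print(f"uniform share per column: {uniform_share:.6f}")
print(coef.nlargest(10, "Importances"))
```

Comparing each value against the uniform share (one divided by the number of columns) gives a rough baseline for whether a column stands out, rather than reading the raw number as a percentage on its own.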