Using Python and scikitlearn, how to output the individual feature dependencies? - Printable Version

+- Python Forum (https://python-forum.io)
+-- Forum: Python Coding (https://python-forum.io/forum-7.html)
+--- Forum: Data Science (https://python-forum.io/forum-44.html)
+--- Thread: Using Python and scikitlearn, how to output the individual feature dependencies? (/thread-24096.html)
Using Python and scikitlearn, how to output the individual feature dependencies? - warren8r - Jan-30-2020

Hello, I am relatively new to Python and machine learning. I have a basic dataset for insurance fraud and a script that generates the model and runs the predictions. I am able to output the accuracy percentages, but I would also like to output the feature dependencies: for example, what role did each attribute play in the prediction? The policy_number would be 0.0%, whereas the claim_amount would likely be 56.2%. Does that make sense? Is there a scikit-learn function for this? Also, is "feature dependency" even the correct term? Thank you for your help! -Matt

RE: Using Python and scikitlearn, how to output the individual feature dependencies? - jefsummers - Jan-30-2020

So, in other words, you would like the coefficients of your model? Once you generate your regression with LR = model.fit(X, y) or similar, LR.coef_ is an array of the coefficients for each of the features. Take that, convert to percent of total, and you will have what you are looking for.

RE: Using Python and scikitlearn, how to output the individual feature dependencies? - warren8r - Jan-30-2020

Hello Jef, Yes, exactly! Thank you so much for this suggestion. So the proper terminology is "coefficients." Aside from LR, does this .coef_ attribute work for any model? Thank you, again, for taking the time to help me.

RE: Using Python and scikitlearn, how to output the individual feature dependencies? - jefsummers - Jan-30-2020

I hesitate to say yes to any or all, but in general that is true. Probably not for classification models, but I have not checked. The other term besides "coefficients" is "weights". I use "coefficients" for the equation and "weights" once you have converted to a percentage. Others can correct me if I am wrong.

RE: Using Python and scikitlearn, how to output the individual feature dependencies? - warren8r - Jan-31-2020

Hello Jef, Thanks again for your input.
Ok, I have made some changes to my code:

from sklearn.ensemble import ExtraTreesClassifier
import pandas as pd

model = ExtraTreesClassifier()
model.fit(x_train, y_train)
coef = pd.DataFrame({'Columns': x_train.columns,
                     'Importances': model.feature_importances_}).sort_values(
    by='Importances', ascending=False)
print(coef.nlargest(10, 'Importances'))

I am getting the following output: [output image not included in the printable version] I can't make sense of this, as the percentages don't seem right. Do they need to be calibrated or converted? Thank you!

RE: Using Python and scikitlearn, how to output the individual feature dependencies? - jefsummers - Jan-31-2020

Sum the coefficients, then divide each coefficient by the sum and multiply by 100 to convert to a percent.

RE: Using Python and scikitlearn, how to output the individual feature dependencies? - piotrkuras - May-19-2021

Good morning, I am a student at the University of Rzeszow. As part of my master's thesis, I am conducting a study on the use of data clustering methods. Please complete the survey found at the link https://forms.gle/tK8mdjbxaKeRAQpm7. The survey is anonymous and consists of 9 short questions. Thank you for your time. Piotr Kuras

RE: Using Python and scikitlearn, how to output the individual feature dependencies? - jefsummers - May-19-2021

Not really telling you what to do, but a survey is usually meant to describe and/or predict behavior in a population. What population do you think you have posting here? For your thesis, how are you going to describe the eligible population that was surveyed?

RE: Using Python and scikitlearn, how to output the individual feature dependencies? - Caprone - May-20-2021

I don't see the problem; that is your Gini importance feature ranking. Of course you can tune your algorithm, but the logic is always the same.
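jefsummers' original suggestion (take LR.coef_ and convert to percent of total) can be sketched as follows. This is a minimal sketch, not the poster's actual script: the synthetic data from make_regression stands in for the insurance dataset, and absolute values are taken because coefficients can be negative while percentages should not be.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

# Hypothetical toy data standing in for the insurance dataset
X, y = make_regression(n_samples=200, n_features=4, noise=0.1, random_state=0)

LR = LinearRegression().fit(X, y)
coefs = np.abs(LR.coef_)              # magnitude of each feature's coefficient
weights = 100 * coefs / coefs.sum()   # percent of total, as suggested above
print(weights)                        # the entries sum to 100
```

Note that this only works for linear models such as LinearRegression or LogisticRegression; tree-based estimators expose feature_importances_ instead of coef_.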
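For the ExtraTreesClassifier case, the sum-and-divide step described above can be sketched like this. Again a hypothetical, self-contained example (the column names feat_0..feat_4 and the make_classification data are stand-ins, not the poster's fraud dataset); as Caprone notes, the Gini importances already sum to 1.0, so the division mostly just rescales them to read as percentages.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier

# Hypothetical stand-in for the fraud training data
X, y = make_classification(n_samples=300, n_features=5, random_state=0)
x_train = pd.DataFrame(X, columns=[f"feat_{i}" for i in range(5)])

model = ExtraTreesClassifier(random_state=0).fit(x_train, y)
imp = model.feature_importances_          # Gini importances; already sum to 1.0
coef = pd.DataFrame({"Columns": x_train.columns,
                     "Importances": 100 * imp / imp.sum()})  # as percentages
print(coef.sort_values("Importances", ascending=False))
```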