Thread: Printing effect sizes for variables in an anova (/thread-39334.html) - Python Forum (https://python-forum.io)
Printing effect sizes for variables in an anova - eyavuz21 - Feb-01-2023

Hey all,

I have the following code:

    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    resultmodeldistancevariation2sleep = smf.ols(formula='weighteddistance ~ age + C(gender) + C(highest_education_level_acheived)', data=x).fit()
    resultmodeldistancevariation2sleep.summary()

I am trying to print the output of a linear model fitted with smf.ols. However, in the output below, one category of each categorical variable is used as a baseline. For example, group 1.0 is the baseline for gender, and the other categories of that variable (gender[T.2.0] and gender[T.3.0]) are then compared to that baseline category.

                                                coef  std err       t  P>|t|  [0.025  0.975]
    Intercept                                -0.6726    0.220  -3.058  0.002  -1.104  -0.241
    C(gender)[T.2.0]                          0.1905    0.050   3.822  0.000   0.093   0.288
    C(gender)[T.3.0]                          0.2810    0.174   1.619  0.106  -0.060   0.622
    C(highest_education_level_acheived)[T.3]  0.0115    0.208   0.056  0.956  -0.397   0.420
    C(highest_education_level_acheived)[T.4] -0.0295    0.214  -0.138  0.890  -0.449   0.390
    C(highest_education_level_acheived)[T.5]  0.0912    0.207   0.439  0.660  -0.316   0.499
    C(highest_education_level_acheived)[T.6]  0.2657    0.219   1.216  0.224  -0.163   0.695
    C(highest_education_level_acheived)[T.7]  0.3885    0.253   1.539  0.124  -0.107   0.884
    age                                       0.0150    0.003   4.716  0.000   0.009   0.02

However, I want to see the effect of each categorical variable as a whole, not of each category within that variable. I therefore pass the smf.ols result to an ANOVA using 'anova_lm':

    anovaoutput = sm.stats.anova_lm(resultmodeldistancevariation2sleep)
    anovaoutput['PR(>F)'] = anovaoutput['PR(>F)'].round(4)

                                            df      sum_sq   mean_sq          F  PR(>F)
    C(gender)                              2.0    4.227966  2.113983   5.681874  0.0036
    C(highest_education_level_acheived)    5.0   11.425706  2.285141   6.141906  0.0000
    age                                    1.0    8.274317  8.274317  22.239357  0.0000
    Residual                             647.0  240.721120  0.372057        NaN     NaN

However, neither the confidence intervals nor the coefficients are printed in this output.
How can I amend my code to print these values as part of the anova_lm output? Is there also a way I can calculate the partial eta squared? I would be so grateful for a helping hand!

The dataframe x looks like this (first 5 rows, shown in two parts because of its width):

        age  gender  highest_education_level_acheived  hours_of_phone_use_per_week  weight  height
    0  24.0     1.0                                 4                         13.0   201.0    69.0
    1  33.0     1.0                                 3                         10.0   140.0    68.0
    3  34.0     1.0                                 3                          5.0   170.0    72.0
    4  32.0     1.0                                 4                          1.0   205.0    69.0
    7  23.0     1.0                                 5                          5.0   180.0    72.0

       drink_alcohol_yes_no  drink_caffeine_yes_no  growupenvironment  sunlight  frequency_of_naps  weighteddistance
    0                     1                      1                1.0  0.666667                  2         -0.423448
    1                     2                      2                2.0  0.500000                  3         -0.375761
    3                     1                      2                2.0  0.166667                  3         -0.197738
    4                     1                      1                2.0  1.000000                  1         -0.767542
    7                     1                      1                2.0  0.333333                  1          0.190099

RE: Printing effect sizes for variables in an anova - Larz60+ - Feb-01-2023

To suggest amending your code, we should first be able to examine your code. You haven't provided that.

There are examples:
All examples
Ordinary Least Squares

There is also scipy; see sklearn linear model.

RE: Printing effect sizes for variables in an anova - eyavuz21 - Feb-01-2023

(Feb-01-2023, 11:14 AM)Larz60+ Wrote: To suggest amending your code, we should first be able to examine your code.

I have provided the first 5 rows of the dataframe 'x', which I think should be enough to understand my issue. Thank you for the suggestions with those links, but sadly my issue has not yet been solved. Is that enough information for you? :) Let me know what else to provide!
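
On the partial eta squared question, here is a minimal sketch rather than a definitive answer. It assumes the usual definition, SS_effect / (SS_effect + SS_residual), and uses the sum_sq values copied from the anova_lm table in the first post. The helper name partial_eta_squared is made up for illustration; it is not a statsmodels function.

```python
# Sums of squares copied from the anova_lm table in the first post.
# With the real table these would come from anovaoutput['sum_sq'].
sum_sq = {
    "C(gender)": 4.227966,
    "C(highest_education_level_acheived)": 11.425706,
    "age": 8.274317,
    "Residual": 240.721120,
}


def partial_eta_squared(sum_sq, residual_key="Residual"):
    """Hypothetical helper: SS_effect / (SS_effect + SS_residual) per term."""
    ss_resid = sum_sq[residual_key]
    return {
        term: ss / (ss + ss_resid)
        for term, ss in sum_sq.items()
        if term != residual_key  # the residual row has no effect size
    }


effects = partial_eta_squared(sum_sq)
for term, value in effects.items():
    print(f"{term}: {value:.4f}")
```

With the posted numbers this gives roughly 0.017 for gender, 0.045 for education and 0.033 for age. Two caveats: anova_lm defaults to typ=1 (sequential sums of squares), so these values depend on the order of terms in the formula, and passing typ=2 may be closer to what is wanted. As for coefficients and confidence intervals, a factor with several levels has no single coefficient, so they cannot really be added to the ANOVA table; the per-level values remain available from the fitted result via resultmodeldistancevariation2sleep.params and resultmodeldistancevariation2sleep.conf_int().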