Thread: Newbie question how to find the coefficient for each variable (Python Forum, Data Science)
Newbie question how to find the coefficient for each variable - zydjohn - Dec-13-2017

Hello:
I have some code that does multiple-variable linear regression using statsmodels. Here is my code:

    import numpy as np
    import statsmodels.api as sm
    import statsmodels.formula.api as smf
    import pandas as pd

    x0 = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25]
    y = [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]

    def genList1(x, n, offset):
        list1 = []
        if (n + offset) <= len(x):
            list1 = x[offset:(offset + n)]
        return(list1)

    x1 = genList1(x0, 20, 5)
    x2 = genList1(x0, 20, 4)
    x3 = genList1(x0, 20, 3)
    xy = [('Y', y), ('x1', x1), ('x2', x2), ('x3', x3)]
    df = pd.DataFrame.from_items(xy)
    model = smf.ols('y ~ x1 + x2 +x3', df).fit()
    print(model.summary())
    print('Done')

I can see the following results:

                                OLS Regression Results
    ==============================================================================
    Dep. Variable:                      y   R-squared:                       1.000
    Model:                            OLS   Adj. R-squared:                  1.000
    Method:                 Least Squares   F-statistic:                 1.204e+32
    Date:                Wed, 13 Dec 2017   Prob (F-statistic):          6.91e-279
    Time:                        18:20:25   Log-Likelihood:                 646.36
    No. Observations:                  20   AIC:                            -1289.
    Df Residuals:                      18   BIC:                            -1287.
    Df Model:                           1
    Covariance Type:            nonrobust
    ==============================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
    ------------------------------------------------------------------------------
    Intercept     -1.0000   4.74e-16  -2.11e+15      0.000      -1.000      -1.000
    x1            -0.6667   4.46e-16   -1.5e+15      0.000      -0.667      -0.667
    x2             0.3333   3.04e-17    1.1e+16      0.000       0.333       0.333
    x3             1.3333   5.02e-16   2.65e+15      0.000       1.333       1.333
    ==============================================================================
    Omnibus:                        1.008   Durbin-Watson:                   0.381
    Prob(Omnibus):                  0.604   Jarque-Bera (JB):                0.784
    Skew:                           0.452   Prob(JB):                        0.676
    Kurtosis:                       2.645   Cond. No.                     6.56e+16
    ==============================================================================

But I want to use the coefficient for each variable, for example, the coef for x1 (-0.6667), the coef for x2 (0.3333), the coef for x3 (1.3333) and the Intercept (-1.0).
I can't find any useful documentation on how to extract each coefficient and the intercept from the linear regression model.
Please advise,
Thanks,

RE: Newbie question how to find the coefficient for each variable - j.crater - Dec-13-2017

Hey, I'm not sure if it's of much help, but check if this can give you what you need:
http://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.RegressionResults.html#statsmodels.regression.linear_model.RegressionResults

RE: Newbie question how to find the coefficient for each variable - zydjohn - Dec-13-2017

Hello,
Thanks for your help. However, after reading the linked page, I still found no example of how to get the results I want.
I even found the source code for statsmodels.regression.linear_model. Here is the link:
http://www.statsmodels.org/dev/_modules/statsmodels/regression/linear_model.html
But I am a newbie and can't read such complicated Python code.
Can anyone help me fetch the coefficient for each variable, i.e. just print out the values of the coefficients for x1, x2, x3 and the intercept?
Thanks,

RE: Newbie question how to find the coefficient for each variable - j.crater - Dec-13-2017

Hello,
You will need to get familiar with the basic concepts of object-oriented programming in Python to make good use of the statsmodels module. After a quick glance, I suggest trying model.params to get the parameter values. That is, after you have called fit().

RE: Newbie question how to find the coefficient for each variable - Larz60+ - Dec-13-2017

See: http://www.statsmodels.org/stable/regression.html#module-statsmodels.regression.linear_model
In the example there, they list the coefficients in res.summary. If you download the source:
https://pypi.python.org/packages/72/16/d7e7a70fc8ca3cc0d783a66e902a7adf80a810695c357cd48bb22c82451a/statsmodels-0.8.0.tar.gz#md5=b3e5911cc9b00b71228d5d39a880bba0
you should be able to locate the OLS class and its summary method and see how the coefficients are printed.
Actually, there's a link to the detailed docs on that page, so maybe you don't have to download it.

RE: Newbie question how to find the coefficient for each variable - j.crater - Dec-14-2017

From what I understood, the OP wants to access the parameter values in the form of a variable that he can use in the program. He already used summary in the code he posted, but it just gives a printout of the values.

RE: Newbie question how to find the coefficient for each variable - Larz60+ - Dec-14-2017

After the command:

    model = smf.ols('y ~ x1 + x2 +x3', df).fit()

gets run, the link under Model classes (that I provided) contains a link to OLS:
http://www.statsmodels.org/stable/generated/statsmodels.regression.linear_model.OLS.html#statsmodels.regression.linear_model.OLS
That page shows that:

    >>> results.tvalues
    array([ 1.87867287,  0.98019606])
    >>> print(results.t_test([1, 0]))
    <T test: effect=array([ 2.14285714]), sd=array([[ 1.14062282]]), t=array([[ 1.87867287]]), p=array([[ 0.05953974]]), df_denom=5>

contains the coefficients as well as other information. I believe this is correct.

RE: Newbie question how to find the coefficient for each variable - j.crater - Dec-14-2017

This seems to be the output for t-values (there are also p-values), which are among the statistics involved in linear regression models. But these are not the regression function coefficients.
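
To make the distinction between coefficients and t-values concrete, here is a minimal sketch that puts the relevant result attributes side by side. It assumes a recent pandas (the frame is built from a dict, since `DataFrame.from_items` was removed in later pandas releases) and slices `x0` directly instead of calling `genList1`; the name `stats` is just an illustrative choice.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Same data as in the thread: y plus three shifted windows of 1..25.
x0 = list(range(1, 26))
df = pd.DataFrame({
    "y": list(range(2, 22)),
    "x1": x0[5:25],
    "x2": x0[4:24],
    "x3": x0[3:23],
})

model = smf.ols("y ~ x1 + x2 + x3", data=df).fit()

# Each attribute is a pandas Series indexed by term name, so they
# line up as columns of one table.
stats = pd.DataFrame({
    "coef": model.params,     # regression coefficients ("coef" in the summary)
    "std_err": model.bse,     # standard errors ("std err")
    "t": model.tvalues,       # t statistics ("t")
    "p": model.pvalues,       # p-values ("P>|t|")
})
print(stats)
```

Only the `coef` column (`model.params`) holds the regression function coefficients; `tvalues` and `pvalues` are test statistics derived from them.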

RE: Newbie question how to find the coefficient for each variable - Larz60+ - Dec-14-2017

    import numpy as np
    import statsmodels.api as sm
    import statsmodels.formula.api as smf
    import pandas as pd

    x0 = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25]
    y = [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]

    def genList1(x, n, offset):
        list1 = []
        if (n + offset) <= len(x):
            list1 = x[offset:(offset + n)]
        return(list1)

    x1 = genList1(x0, 20, 5)
    x2 = genList1(x0, 20, 4)
    x3 = genList1(x0, 20, 3)
    xy = [('Y', y), ('x1', x1), ('x2', x2), ('x3', x3)]
    df = pd.DataFrame.from_items(xy)
    model = smf.ols('y ~ x1 + x2 +x3', df).fit()
    print(model.summary())
    print(f'model.params: {model.params}')
    print('Done')

model.params returns:
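
As a hedged illustration of how the values in `model.params` can be pulled out into ordinary variables, the sketch below reads each coefficient by its term name and then reuses them in a hand-written prediction. The variable names (`intercept`, `coefs`, `predicted`) are illustrative only, and the frame is built from a dict on the assumption of a pandas version where `DataFrame.from_items` is no longer available.

```python
import pandas as pd
import statsmodels.formula.api as smf

x0 = list(range(1, 26))
df = pd.DataFrame({
    "y": list(range(2, 22)),
    "x1": x0[5:25],   # same values as genList1(x0, 20, 5)
    "x2": x0[4:24],   # same values as genList1(x0, 20, 4)
    "x3": x0[3:23],   # same values as genList1(x0, 20, 3)
})

model = smf.ols("y ~ x1 + x2 + x3", data=df).fit()

# With the formula interface, model.params is a pandas Series keyed by
# term name, so single values can be read like dictionary entries.
intercept = float(model.params["Intercept"])
coefs = {name: float(value) for name, value in model.params.items()}

# Reuse the extracted numbers: predict y for the first row by hand.
row = df.iloc[0]
predicted = intercept + sum(coefs[name] * row[name] for name in ("x1", "x2", "x3"))

print("intercept:", intercept)
print("coefficients:", coefs)
print("predicted:", predicted, "actual:", row["y"])
```

One caveat about this particular data set: x1, x2 and x3 are shifted copies of the same sequence, so they are perfectly collinear. The individual coefficient values are therefore one of many equally good solutions, and only the fitted values are uniquely determined.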

RE: Newbie question how to find the coefficient for each variable - zydjohn - Dec-14-2017

model.params is the correct answer.
Thanks,