Python Forum
Newbie question how to find the coefficient for each variable - Printable Version


Newbie question how to find the coefficient for each variable - zydjohn - Dec-13-2017

Hello:
I have some code that does multiple linear regression with statsmodels. Here is my code:
import numpy as np
import statsmodels.api as sm
import statsmodels.formula.api as smf
import pandas as pd

x0 = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25]
y = [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]

def genList1(x, n, offset):
    # Return n consecutive items of x starting at offset, or [] if x is too short
    list1 = []
    if (n + offset) <= len(x):
        list1 = x[offset:(offset + n)]
    return list1

x1 = genList1(x0, 20, 5)
x2 = genList1(x0, 20, 4)
x3 = genList1(x0, 20, 3)

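# DataFrame.from_items was removed in pandas 1.0; a dict works in old and new versions.
# The column is named 'y' (not 'Y') so it matches the name used in the formula.
df = pd.DataFrame({'y': y, 'x1': x1, 'x2': x2, 'x3': x3})
model = smf.ols('y ~ x1 + x2 + x3', df).fit()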
print(model.summary())
print('Done')
I can see the following results:
                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       1.000
Model:                            OLS   Adj. R-squared:                  1.000
Method:                 Least Squares   F-statistic:                 1.204e+32
Date:                Wed, 13 Dec 2017   Prob (F-statistic):          6.91e-279
Time:                        18:20:25   Log-Likelihood:                 646.36
No. Observations:                  20   AIC:                            -1289.
Df Residuals:                      18   BIC:                            -1287.
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     -1.0000   4.74e-16  -2.11e+15      0.000      -1.000      -1.000
x1            -0.6667   4.46e-16   -1.5e+15      0.000      -0.667      -0.667
x2             0.3333   3.04e-17    1.1e+16      0.000       0.333       0.333
x3             1.3333   5.02e-16   2.65e+15      0.000       1.333       1.333
==============================================================================
Omnibus:                        1.008   Durbin-Watson:                   0.381
Prob(Omnibus):                  0.604   Jarque-Bera (JB):                0.784
Skew:                           0.452   Prob(JB):                        0.676
Kurtosis:                       2.645   Cond. No.                     6.56e+16
==============================================================================

But I want to use the coefficient of each variable in my program, for example the coef for x1 (-0.6667), the coef for x2 (0.3333),
the coef for x3 (1.3333) and the Intercept (-1.0).
I can't find any useful documentation on how to extract each coefficient and the intercept from the linear regression model.
Please advise.
Thanks,


RE: Newbie question how to find the coefficient for each variable - j.crater - Dec-13-2017

Hey, I'm not sure if it's of much help, but check if this can give you what you need:
http://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.RegressionResults.html#statsmodels.regression.linear_model.RegressionResults


RE: Newbie question how to find the coefficient for each variable - zydjohn - Dec-13-2017

Hello,
Thanks for your help. However, after reading the page you linked, I still can't find an example of how to get the results I want.
I even found the source code for statsmodels.regression.linear_model.
Here is the link: http://www.statsmodels.org/dev/_modules/statsmodels/regression/linear_model.html
But I am a newbie, and I can't read such complicated Python code.
Can anyone help me fetch the coefficient for each variable, i.e. just print out the values of the coefficients for x1, x2, x3 and the intercept?
Thanks,


RE: Newbie question how to find the coefficient for each variable - j.crater - Dec-13-2017

Hello,
You will need to get familiar with the basic concepts of object-oriented programming in Python to make good use of the statsmodels module.
After a quick glance, I suggest trying model.params to get the parameter values. It is available once you have called fit().
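
To illustrate the suggestion, here is a minimal sketch of reading individual values out of model.params. The data frame is hypothetical (it is not the data from this thread) and is built so that y = 1 + 2*x1 + 3*x2 exactly. With the formula API, params is a pandas Series indexed by term name, so each coefficient can be looked up by name.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: y = 1 + 2*x1 + 3*x2 exactly, with x1 and x2 not collinear
df = pd.DataFrame({
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [2, 1, 4, 3, 6, 5],
})
df["y"] = 1 + 2 * df["x1"] + 3 * df["x2"]

model = smf.ols("y ~ x1 + x2", data=df).fit()

# params is a pandas Series indexed by term name
intercept = model.params["Intercept"]
coef_x1 = model.params["x1"]
coef_x2 = model.params["x2"]

print(intercept, coef_x1, coef_x2)
```

The values come back as floats that are very close to 1, 2 and 3, up to floating-point rounding.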


RE: Newbie question how to find the coefficient for each variable - Larz60+ - Dec-13-2017

See: http://www.statsmodels.org/stable/regression.html#module-statsmodels.regression.linear_model
In the example there, they list the coefficients in res.summary().

If you download the source: https://pypi.python.org/packages/72/16/d7e7a70fc8ca3cc0d783a66e902a7adf80a810695c357cd48bb22c82451a/statsmodels-0.8.0.tar.gz#md5=b3e5911cc9b00b71228d5d39a880bba0

you should be able to locate the OLS class and its summary method, and see how the coefficients are printed.

Actually, there's a link to the detailed docs on that page, so maybe you don't have to download anything.


RE: Newbie question how to find the coefficient for each variable - j.crater - Dec-14-2017

From what I understood, the OP wants to access the parameter values as variables he can use in the program. He already called summary() in the code he posted, but that only prints the values.


RE: Newbie question how to find the coefficient for each variable - Larz60+ - Dec-14-2017

After this command runs:
model = smf.ols('y ~ x1 + x2 +x3', df).fit()
you have a fitted results object. The page I linked has, under Model classes, a
link to OLS: http://www.statsmodels.org/stable/generated/statsmodels.regression.linear_model.OLS.html#statsmodels.regression.linear_model.OLS
That page shows this example:
>>> results.tvalues
array([ 1.87867287,  0.98019606])
>>> print(results.t_test([1, 0]))
<T test: effect=array([ 2.14285714]), sd=array([[ 1.14062282]]), t=array([[ 1.87867287]]), p=array([[ 0.05953974]]), df_denom=5>
which contains the coefficients as well as other information.
I believe this is correct.


RE: Newbie question how to find the coefficient for each variable - j.crater - Dec-14-2017

That output shows the t-values (there are also p-values), which are among the statistics reported for a linear regression model. They are not the coefficients of the regression function.
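
To make the distinction concrete, here is a small sketch that prints the coefficients next to the related statistics. The data is hypothetical, with a little hand-written noise so the standard errors are not zero. The attributes params, bse, tvalues and pvalues all exist on the fitted results object; each t-value is the coefficient divided by its standard error.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: roughly y = 1 + 2*x1 + 3*x2, plus small hand-written noise
df = pd.DataFrame({
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [2, 1, 4, 3, 6, 5, 8, 7],
    "y":  [9.1, 7.8, 19.2, 17.9, 29.1, 28.0, 38.9, 38.1],
})

res = smf.ols("y ~ x1 + x2", data=df).fit()

print(res.params)    # regression coefficients (what the OP is asking for)
print(res.bse)       # standard errors of the coefficients
print(res.tvalues)   # t-statistics, equal to params / bse
print(res.pvalues)   # p-values of the t-tests

# Check the relationship between the three quantities
print(np.allclose(res.tvalues, res.params / res.bse))
```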


RE: Newbie question how to find the coefficient for each variable - Larz60+ - Dec-14-2017

import numpy as np
import statsmodels.api as sm
import statsmodels.formula.api as smf
import pandas as pd
 
x0 = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25]
y = [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]
 
def genList1(x, n, offset):
    # Return n consecutive items of x starting at offset, or [] if x is too short
    list1 = []
    if (n + offset) <= len(x):
        list1 = x[offset:(offset + n)]
    return list1
 
x1 = genList1(x0, 20, 5)
x2 = genList1(x0, 20, 4)
x3 = genList1(x0, 20, 3)
 
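# DataFrame.from_items was removed in pandas 1.0; a dict works in old and new versions.
# The column is named 'y' (not 'Y') so it matches the name used in the formula.
df = pd.DataFrame({'y': y, 'x1': x1, 'x2': x2, 'x3': x3})
model = smf.ols('y ~ x1 + x2 + x3', df).fit()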
print(model.summary())
print(f'model.params: {model.params}')
print('Done')
model.params returns:
Output:
model.params: Intercept   -1.000000
x1          -0.666667
x2           0.333333
x3           1.333333
dtype: float64
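
Once the values are in model.params, they can be used like any other numbers. The sketch below is one way to do that, on hypothetical data where y = 1 + 2*x1 + 3*x2 exactly. The helper predict_by_hand is a made-up name for illustration. It applies the coefficients manually and the result is compared with model.predict.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: y = 1 + 2*x1 + 3*x2 exactly
df = pd.DataFrame({
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [2, 1, 4, 3, 6, 5],
})
df["y"] = 1 + 2 * df["x1"] + 3 * df["x2"]

model = smf.ols("y ~ x1 + x2", data=df).fit()

# Convert the Series to a plain dict: {'Intercept': ..., 'x1': ..., 'x2': ...}
coefs = model.params.to_dict()

def predict_by_hand(x1, x2):
    # Apply the fitted regression function manually
    return coefs["Intercept"] + coefs["x1"] * x1 + coefs["x2"] * x2

by_hand = predict_by_hand(10, 7)
by_model = model.predict(pd.DataFrame({"x1": [10], "x2": [7]})).iloc[0]

print(by_hand, by_model)
```

Both numbers should agree, and for this data they are close to 1 + 2*10 + 3*7 = 42.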



RE: Newbie question how to find the coefficient for each variable - zydjohn - Dec-14-2017

model.params is the correct answer.
Thanks,
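
A note on the data used in this thread: x1, x2 and x3 are the same sequence shifted by one (x1 = x2 + 1 = x3 + 2), so the predictors are perfectly collinear. That is consistent with the "Df Model: 1" and the very large "Cond. No." in the summary above, and it means the individual coefficients are not uniquely determined: other combinations fit y equally well. The sketch below checks this with NumPy alone, rebuilding the lists with slicing instead of genList1.

```python
import numpy as np

# Same values as in the thread, rebuilt with range and slicing
x0 = list(range(1, 26))
x1 = x0[5:25]
x2 = x0[4:24]
x3 = x0[3:23]

# Design matrix with an intercept column, as the formula 'y ~ x1 + x2 + x3' would build it
X = np.column_stack([np.ones(20), x1, x2, x3])

rank = np.linalg.matrix_rank(X)
print(X.shape[1], rank)  # 4 columns, but fewer independent ones

# The shifts between the predictors are constant
print(np.all(np.array(x1) - np.array(x2) == 1))
print(np.all(np.array(x2) - np.array(x3) == 1))
```

With real data, where the predictors are not exact shifts of each other, model.params gives well-defined coefficients.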