Python Forum
Newbie question how to find the coefficient for each variable
#1
Hello:
I have some code that does multiple linear regression using statsmodels. Here is my code:
import numpy as np
import statsmodels.api as sm
import statsmodels.formula.api as smf
import pandas as pd

x0 = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25]
y = [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]

def genList1(x, n, offset):
    # Return n items of x starting at offset, or an empty list if x is too short
    list1 = []
    if (n + offset) <= len(x):
        list1 = x[offset:(offset + n)]
    return list1

x1 = genList1(x0, 20, 5)
x2 = genList1(x0, 20, 4)
x3 = genList1(x0, 20, 3)

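# pd.DataFrame.from_items was removed in pandas 1.0; build the frame from a dict instead.
# The column is named 'y' so that it matches the name used in the formula.
df = pd.DataFrame({'y': y, 'x1': x1, 'x2': x2, 'x3': x3})
model = smf.ols('y ~ x1 + x2 + x3', df).fit()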
print(model.summary())
print('Done')
I can see the following results:
                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       1.000
Model:                            OLS   Adj. R-squared:                  1.000
Method:                 Least Squares   F-statistic:                 1.204e+32
Date:                Wed, 13 Dec 2017   Prob (F-statistic):          6.91e-279
Time:                        18:20:25   Log-Likelihood:                 646.36
No. Observations:                  20   AIC:                            -1289.
Df Residuals:                      18   BIC:                            -1287.
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     -1.0000   4.74e-16  -2.11e+15      0.000      -1.000      -1.000
x1            -0.6667   4.46e-16   -1.5e+15      0.000      -0.667      -0.667
x2             0.3333   3.04e-17    1.1e+16      0.000       0.333       0.333
x3             1.3333   5.02e-16   2.65e+15      0.000       1.333       1.333
==============================================================================
Omnibus:                        1.008   Durbin-Watson:                   0.381
Prob(Omnibus):                  0.604   Jarque-Bera (JB):                0.784
Skew:                           0.452   Prob(JB):                        0.676
Kurtosis:                       2.645   Cond. No.                     6.56e+16
==============================================================================

But I want to use the coefficient for each variable in my program, for example the coef for x1 (-0.6667), the coef for x2 (0.3333),
the coef for x3 (1.3333) and the Intercept (-1.0).
I can't find any useful documentation on how to extract each coefficient and the intercept from the fitted linear regression model.
Please advise,
Thanks,
Reply
#2
Hey, I'm not sure if it's of much help, but check if this can give you what you need:
http://www.statsmodels.org/dev/generated...ionResults
Reply
#3
Hello,
Thanks for your help. However, after reading the linked page, I still can't find an example of how to get the results I want.
I even found the source code for statsmodels.regression.linear_model.
Here is the link: http://www.statsmodels.org/dev/_modules/...model.html
But I am a newbie and can't follow such complicated Python code.
Can anyone show me how to fetch the coefficient for each variable, i.e. just print out the values of the coefficients for x1, x2, x3 and the intercept?
Thanks,
Reply
#4
Hello,
You will need to get familiar with the basic concepts of object-oriented programming in Python to make good use of the statsmodels module.
After a quick glance, I suggest trying model.params to get the parameter values. It is available once you have called fit().
Reply
#5
See: http://www.statsmodels.org/stable/regres...near_model
In the example there, the coefficients are listed in the output of res.summary().

If you download the source: https://pypi.python.org/packages/72/16/d...39a880bba0

you should be able to locate the OLS class and its summary method and see how the coefficients are printed.

Actually, there's a link to the detailed docs on that page, so maybe you don't have to download anything.
Reply
#6
From what I understand, the OP wants to access the parameter values as variables that can be used in the program. He already used summary() in the code he posted, but that only gives a printout of the values.
Reply
#7
After the command:
model = smf.ols('y ~ x1 + x2 +x3', df).fit()
has run, follow the link under "Model classes" on the page I provided. It leads to
the OLS page: http://www.statsmodels.org/stable/genera..._model.OLS
The example on that page shows:
>>> results.tvalues
array([ 1.87867287,  0.98019606])
>>> print(results.t_test([1, 0]))
<T test: effect=array([ 2.14285714]), sd=array([[ 1.14062282]]), t=array([[ 1.87867287]]), p=array([[ 0.05953974]]), df_denom=5>
which contains the coefficients as well as other information.
I believe this is correct.
Reply
#8
This looks like the output for the t-values (there are also p-values), which are among the statistics reported for a linear regression model. They are not the coefficients of the regression function.
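To make the distinction concrete, here is a small sketch, assuming a reasonably recent statsmodels, pandas and NumPy. The data is made up for illustration (it is not the data from post #1) and the seed and variable names are arbitrary. It prints the coefficients, their standard errors and the t-values side by side, and checks that each t-value is just the coefficient divided by its standard error.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical noisy data, so the standard errors are not essentially zero
rng = np.random.default_rng(0)
df = pd.DataFrame({"x1": rng.normal(size=50), "x2": rng.normal(size=50)})
df["y"] = 1.0 + 2.0 * df["x1"] - 3.0 * df["x2"] + rng.normal(scale=0.5, size=50)

results = smf.ols("y ~ x1 + x2", data=df).fit()

print(results.params)   # regression coefficients (the "coef" column of the summary)
print(results.bse)      # standard errors (the "std err" column)
print(results.tvalues)  # t statistics (the "t" column)

# Each t statistic is the coefficient divided by its standard error
ratio = results.params / results.bse
print(np.allclose(ratio, results.tvalues))
```

So tvalues is derived from the coefficients, but the coefficients themselves live in params.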
Reply
#9
import numpy as np
import statsmodels.api as sm
import statsmodels.formula.api as smf
import pandas as pd
 
x0 = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25]
y = [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]
 
def genList1(x, n, offset):
    # Return n items of x starting at offset, or an empty list if x is too short
    list1 = []
    if (n + offset) <= len(x):
        list1 = x[offset:(offset + n)]
    return list1
 
x1 = genList1(x0, 20, 5)
x2 = genList1(x0, 20, 4)
x3 = genList1(x0, 20, 3)
 
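# pd.DataFrame.from_items was removed in pandas 1.0; build the frame from a dict instead.
# The column is named 'y' so that it matches the name used in the formula.
df = pd.DataFrame({'y': y, 'x1': x1, 'x2': x2, 'x3': x3})
model = smf.ols('y ~ x1 + x2 + x3', df).fit()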
print(model.summary())
print(f'model.params: {model.params}')
print('Done')
model.params returns:
Output:
model.params: Intercept   -1.000000
x1          -0.666667
x2           0.333333
x3           1.333333
dtype: float64
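Since the original question was about using the coefficients in a program rather than just printing them, here is a hedged sketch of one way to do that. With the formula API, model.params is a pandas Series indexed by term name, so each value can be looked up by label. The sketch uses the same numbers as the thread, but builds the DataFrame from a dict (my substitution, assuming a pandas version where from_items is no longer available) and slices x0 directly instead of calling genList1.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Same data as the thread: y and three shifted windows of 1..25
x0 = list(range(1, 26))
df = pd.DataFrame({
    "y": list(range(2, 22)),
    "x1": x0[5:25],
    "x2": x0[4:24],
    "x3": x0[3:23],
})

model = smf.ols("y ~ x1 + x2 + x3", data=df).fit()
b = model.params  # pandas Series with labels Intercept, x1, x2, x3

# Pull individual coefficients out by label as plain floats
intercept = float(b["Intercept"])
coef_x1 = float(b["x1"])

# Rebuild the fitted values by hand from the extracted coefficients
manual = b["Intercept"] + b["x1"] * df["x1"] + b["x2"] * df["x2"] + b["x3"] * df["x3"]

# Compare against the fitted values statsmodels computed itself
print(np.allclose(manual, model.fittedvalues))
```

model.params.to_dict() gives the same values as an ordinary dict if that is more convenient.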
Reply
#10
model.params is the correct answer.
Thanks,
Reply

