Python Forum
Outputting LogisticRegression Coefficients (sklearn)
#1
Good day,

I'm using the sklearn LogisticRegression class for some data analysis and am wondering how to output the coefficients for the predictors.

I'm using a Pipeline to standardize and power transform the data. Below is a snippet of the code. I'm not sure how to output the coefficients after fitting.

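from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PowerTransformer, StandardScaler

# fit a model (Pipeline - Standardization, PowerTransform, LR)
steps = [('t1', StandardScaler()), ('t2', PowerTransformer()), ('m', LogisticRegression(solver='lbfgs', class_weight='balanced'))]
model = Pipeline(steps=steps)
model = model.fit(X, y)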
Reply
#2
It is easy: just use the named_steps attribute of the fitted Pipeline, e.g.
model.named_steps['m'].coef_
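
For anyone who wants to try this end to end, here is a minimal sketch of the same lookup. It assumes synthetic data from make_classification in place of the thread's X and y, which are not shown. The step name 'm' matches the pipeline in the first post. Note that the coefficients refer to the transformed features, not the raw inputs.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PowerTransformer, StandardScaler

# Synthetic stand-in for the thread's data (assumption: binary target).
X, y = make_classification(n_samples=200, n_features=4, n_informative=3,
                           n_redundant=1, random_state=0)

steps = [('t1', StandardScaler()), ('t2', PowerTransformer()),
         ('m', LogisticRegression(solver='lbfgs', class_weight='balanced'))]
model = Pipeline(steps=steps).fit(X, y)

clf = model.named_steps['m']   # the fitted LogisticRegression step
coefs = clf.coef_              # shape (1, n_features) for a binary target
intercept = clf.intercept_     # shape (1,)
print(coefs, intercept)
```

Calling model.coef_ directly fails because the Pipeline object itself only exposes its steps; the fitted attributes live on the final estimator.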
Reply
#3
Hi scidam,

Yeah, I was getting tripped up by the Pipeline: I was calling model.coef_, which threw an error because Pipeline has no such attribute.

Glad to see there's an easy fix in this case. Sorry for the bother.

John
Reply
#4
Perhaps I can add an additional (but related) twist.

Now that I have the coefficients, are there additional outputs that show the standard errors of those coefficients and their corresponding p-values?
Reply
#5
(Feb-22-2020, 01:00 PM)RawlinsCross Wrote: Now that I have the coefficients, are there additional outputs that show the standard errors of those coefficients and their corresponding p-values?
Unfortunately, no. Scikit-learn doesn't provide p-values for logistic regression out of the box. However, you can estimate them with a resampling technique (e.g. the bootstrap). Also, take a look at statsmodels.
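
A rough sketch of the bootstrap idea, under stated assumptions: the data is synthetic (make_classification stands in for the thread's X and y), the number of resamples n_boot is an arbitrary illustrative choice, and make_model is a hypothetical helper that rebuilds the pipeline from the first post. The whole pipeline is refitted on each resample so that the variability of the transformers is included.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PowerTransformer, StandardScaler

# Synthetic stand-in for the thread's data (assumption).
X, y = make_classification(n_samples=300, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)

def make_model():
    # Hypothetical helper: a fresh, unfitted copy of the thread's pipeline.
    return Pipeline([('t1', StandardScaler()), ('t2', PowerTransformer()),
                     ('m', LogisticRegression(solver='lbfgs', class_weight='balanced'))])

n_boot = 200                      # illustrative number of resamples
rng = np.random.default_rng(0)
n = len(y)
boot_coefs = []
for _ in range(n_boot):
    idx = rng.integers(0, n, size=n)          # draw rows with replacement
    fitted = make_model().fit(X[idx], y[idx])
    lr = fitted.named_steps['m']
    boot_coefs.append(np.append(lr.intercept_, lr.coef_.ravel()))
boot_coefs = np.array(boot_coefs)             # shape (n_boot, 1 + n_features)

std_err = boot_coefs.std(axis=0, ddof=1)      # bootstrap standard errors
ci_low, ci_high = np.percentile(boot_coefs, [2.5, 97.5], axis=0)  # 95% percentile intervals
print(std_err, ci_low, ci_high)
```

A coefficient whose percentile interval excludes zero is the bootstrap analogue of a small p-value. With very imbalanced data, a resample can end up with too few minority-class rows, so a stratified resample may be the safer choice there.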
Reply
#6
Okay, I imported the statsmodels module and got it to work. One question about this, though: is the Logit class able to replicate certain features of LogisticRegression from sklearn.linear_model?

Specifically, I'm looking to replicate the LogisticRegression line:

# fit a model (Pipeline - Normalization, PowerTransform, LR)
steps = [('t1', MinMaxScaler()), ('t2', PowerTransformer()), ('m', LogisticRegression(solver='lbfgs', class_weight='balanced'))]
model = Pipeline(steps=steps)
It's the class_weight parameter I want to duplicate in statsmodels, as the data is imbalanced. Might you know the statsmodels equivalent?

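# statsmodels attempt
import statsmodels.api as sm
from sklearn.preprocessing import MinMaxScaler, PowerTransformer

scaler = MinMaxScaler()
X = scaler.fit_transform(X)
pt = PowerTransformer()
X = pt.fit_transform(X)
# Logit does not add an intercept on its own, so append a constant column
X = sm.add_constant(X)
logit = sm.Logit(y, X)
result = logit.fit()
print(result.summary())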
Reply
#7
Could I use the example from this Stack Overflow answer?
https://stackoverflow.com/questions/2792...regression#

# Manual P-Values
import numpy as np
import pandas as pd
from scipy import stats
from sklearn.linear_model import LogisticRegression

lr = LogisticRegression(solver='lbfgs', class_weight='balanced')
lr.fit(X, y)
params = np.append(lr.intercept_, lr.coef_)
predictions = lr.predict(X)

# design matrix with a leading column of ones for the intercept
newX = pd.DataFrame({"Constant":np.ones(len(X))}).join(pd.DataFrame(X))
# residual mean squared error, as in ordinary least squares
MSE = (sum((y-predictions)**2))/(len(newX)-len(newX.columns))

# OLS-style coefficient variances: MSE * diag((X'X)^-1)
var_b = MSE*(np.linalg.inv(np.dot(newX.T,newX)).diagonal())
sd_b = np.sqrt(var_b)
ts_b = params/ sd_b

# two-sided p-values from the t distribution
p_values =[2*(1-stats.t.cdf(np.abs(i),(len(newX)-1))) for i in ts_b]

sd_b = np.round(sd_b,3)
ts_b = np.round(ts_b,3)
p_values = np.round(p_values,3)
params = np.round(params,4)

myDF3 = pd.DataFrame()
myDF3["Coefficients"],myDF3["Standard Errors"],myDF3["t values"],myDF3["Probabilites"] = [params,sd_b,ts_b,p_values]
print(myDF3)
Output:
   Coefficients  Standard Errors  t values  Probabilites
0       -0.3453            0.018   -19.285          0.00
1       -0.3326            0.021   -15.983          0.00
2       -0.4929            0.019   -26.082          0.00
3        0.8400            0.021    40.312          0.00
4       -0.2889            0.025   -11.465          0.00
5       -0.2708            0.026   -10.336          0.00
6        0.3760            0.048     7.854          0.00
7        0.0909            0.035     2.566          0.01
8        0.9340            0.055    16.992          0.00
9       -0.4504            0.041   -10.987          0.00
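
A note on the formula above: MSE * (X'X)^-1 is the coefficient variance for ordinary least squares, so it is not the usual estimate for a logistic model. For an unpenalized, unweighted logistic fit, the standard asymptotic covariance is (X' W X)^-1 with W = diag(p * (1 - p)), and the statistic is compared against a standard normal (a Wald z-test). Below is a minimal sketch of that calculation on synthetic data. Setting C=1e6 to make the default L2 penalty negligible and leaving out class_weight are assumptions made so that the formula applies; they are not the thread's original settings.

```python
import numpy as np
from scipy import stats
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the thread's data (assumption).
X, y = make_classification(n_samples=500, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)

# Very large C so the default L2 penalty is negligible (near maximum likelihood).
lr = LogisticRegression(solver='lbfgs', C=1e6, max_iter=1000)
lr.fit(X, y)
params = np.append(lr.intercept_, lr.coef_.ravel())

X_design = np.column_stack([np.ones(len(X)), X])   # intercept column first
p = lr.predict_proba(X)[:, 1]                      # fitted P(y = 1)
W = p * (1 - p)                                    # diagonal of the weight matrix
fisher_info = X_design.T @ (X_design * W[:, None]) # X' W X
cov = np.linalg.inv(fisher_info)                   # asymptotic covariance
std_err = np.sqrt(np.diag(cov))
z_values = params / std_err
p_values = 2 * stats.norm.sf(np.abs(z_values))     # two-sided Wald p-values
print(np.round(std_err, 3), np.round(p_values, 3))
```

With class_weight='balanced' or the default penalty left in place, these numbers are at best approximate; the bootstrap or a statsmodels fit is likely the safer way to get standard errors in that case.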
Reply