Oct-18-2023, 07:09 PM
Dear all,
I'm new to Python and need to use it to fit a linear model to the dataset below:
Location Y X1 X2
1 32 1 1
1 44 1 2
1 58 1 3
1 76 2 1
1 73 2 2
1 37 2 3
1 52 3 1
1 78 3 2
1 60 3 3
2 93 1 1
2 78 1 2
2 25 1 3
2 97 2 1
2 85 2 2
2 60 2 3
2 70 3 1
2 62 3 2
2 95 3 3
My goal is to fit the following linear model: Y ~ X1 + X2
X1 and X2 are categorical variables.
The following code gives me exactly what I need for the whole dataset:
import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
from statsmodels.formula.api import ols
import scipy.stats as stats

# dataset holds the table shown above
df = pd.DataFrame(dataset)

# Fit the model with X1 and X2 treated as categorical
reg = ols('Y ~ C(X1) + C(X2)', data=df).fit()
df['fitted_values'] = reg.fittedvalues

# Studentized residuals from the outlier test
result = reg.outlier_test()
df['student_resid'] = result.student_resid

What I cannot work out is how to run this code separately for each 'Location' and fill the 'fitted_values' and 'student_resid' columns accordingly.
Any help is highly appreciated.
Thanks a lot in advance.
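
For reference, here is a minimal sketch of one way the per-Location fit might be done, assuming it is acceptable to fit a completely separate model for each Location. It splits the data with pandas `groupby`, fits the same formula on each piece, and joins the results back on the row index. The helper name `fit_one_location` is made up for illustration, and the DataFrame is rebuilt by hand from the table in the post so that the example runs on its own.

```python
import pandas as pd
from statsmodels.formula.api import ols

# Data copied from the table in the post.
y = [32, 44, 58, 76, 73, 37, 52, 78, 60,
     93, 78, 25, 97, 85, 60, 70, 62, 95]
df = pd.DataFrame({
    "Location": [1] * 9 + [2] * 9,
    "Y": y,
    "X1": [1, 1, 1, 2, 2, 2, 3, 3, 3] * 2,
    "X2": [1, 2, 3] * 6,
})


def fit_one_location(group):
    """Fit Y ~ C(X1) + C(X2) on one Location's rows (hypothetical helper)."""
    reg = ols("Y ~ C(X1) + C(X2)", data=group).fit()
    # Both results carry the row labels of `group`, so they line up with df.
    return pd.DataFrame(
        {
            "fitted_values": reg.fittedvalues,
            "student_resid": reg.outlier_test()["student_resid"],
        },
        index=group.index,
    )


# Fit each Location separately, stack the pieces, and align them on the row index.
pieces = [fit_one_location(group) for _, group in df.groupby("Location")]
df = df.join(pd.concat(pieces))
print(df)
```

Because every piece keeps the original row labels, the join puts each fitted value and residual back on the row it came from, whatever order the groups are processed in. If a single model with a Location effect is what is actually wanted, this sketch does not cover that case.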