Jun-24-2021, 11:17 PM
I've been able to use the statsmodels.api regression with no issues when assigning variables to x and y. However, I am now trying to use statsmodels.formula.api to run a multiple regression that includes one categorical variable, using the formula= argument. I'm familiar with regression models in R, but I'm switching over to Python and running into issues. I keep getting the following error:
File "<unknown>", line 1
C(Work Country)
SyntaxError: invalid syntax
The code I am running that is causing the error is below:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import openpyxl
import statsmodels.formula.api as smf
df = pd.read_excel('C:/File/data1')
model = smf.ols(formula= 'Age ~ C(Work Country) + Height', data = df).fit()
Any help would be greatly appreciated.
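A likely cause is the space in the column name: the formula string is parsed by patsy as Python-like code, so Work Country is not a valid identifier inside C(...). The sketch below shows two common workarounds, either quoting the name with patsy's Q() or renaming the column before fitting. It assumes a small made-up DataFrame in place of the Excel file; the values and the names model_q, df_renamed, and model_renamed are hypothetical and only there for illustration.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data standing in for the Excel file; the values are made up.
df = pd.DataFrame({
    "Age": [25, 32, 47, 51, 38, 29, 44, 36],
    "Height": [170, 165, 180, 175, 160, 172, 168, 178],
    "Work Country": ["US", "UK", "US", "DE", "UK", "DE", "US", "UK"],
})

# Option 1: wrap the column name in patsy's Q() so the space is accepted.
model_q = smf.ols(formula='Age ~ C(Q("Work Country")) + Height', data=df).fit()

# Option 2: rename the column to a valid Python identifier first.
df_renamed = df.rename(columns={"Work Country": "Work_Country"})
model_renamed = smf.ols(formula="Age ~ C(Work_Country) + Height", data=df_renamed).fit()

# Both fits describe the same model; only the coefficient labels differ.
print(model_q.params)
print(model_renamed.params)
```

Renaming tends to give shorter coefficient labels in the summary output, while Q() leaves the original DataFrame untouched.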