Aug-06-2020, 02:35 AM
(Aug-05-2020, 12:02 PM)russoj5 Wrote: Its not really relevant to my problem, but do you know why it would work when using matplotlib module and not with the statsmodels module?
There is no magic in the - character; it is just one character of a string. Statsmodels internally parses the formula it is given, looking for the +, ~ and - symbols in the formula string. When statsmodels finds -, it treats the surrounding alphanumeric substrings C and IC_6MU as factor/column names (but your data frame doesn't have such columns). All of this behavior is implemented in statsmodels to bring it closer to R-like formula syntax.

When you call
plt.plot(chem_film_data_df['C-IC_6MU'], chem_film_data_df['pass_fail'],'o')
, only the pandas selection engine is involved: you get chem_film_data_df['C-IC_6MU'] and chem_film_data_df['pass_fail'], which are iterables (pandas.Series instances), and these iterables are passed to the plot function. AFAIK, Matplotlib doesn't perform any similar parsing. How