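In statsmodels, there are several similar but distinct classes for fitting an ARIMA-type model to data: ARIMA, SARIMAX and AutoReg. All three are usable, but they are not estimated the same way: AutoReg is fit by OLS (conditional maximum likelihood), whereas ARIMA and SARIMAX are fit by maximum likelihood on a state-space representation by default. The models they fit are also parameterized slightly differently, so the fitted parameters are not directly comparable.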

ARIMA (AR(1) as example)

\[\begin{aligned} y_t &= \bar{y} + \epsilon_t \\ \epsilon_t &= \rho \epsilon_{t-1} + \eta_t \end{aligned}\]

SARIMAX model

\[\begin{aligned} y_t &= \phi + \rho y_{t-1} + \eta_t \end{aligned}\]

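Substituting \(\epsilon_t = y_t - \bar{y}\) into the second ARIMA equation gives \(y_t - \bar{y} = \rho (y_{t-1} - \bar{y}) + \eta_t\), which is the SARIMAX form with \(\phi = \bar{y}(1-\rho)\). The same relation follows from expanding the SARIMAX recursion: for \(|\rho| < 1\), the constant contributes \(\phi(1 + \rho + \rho^2 + \cdots)\), and summing this geometric progression relates \(\bar{y}\) in ARIMA() to \(\phi\) in SARIMAX() by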

\[\frac{\phi}{1-\rho} = \bar{y}\]

If we have an exogenous variable \(x_t\), the SARIMAX model is

\[y_t - \beta x_t = \delta + \rho (y_{t-1} - \beta x_{t-1}) + \eta_t\]

but the ARIMA model is

\[\begin{aligned} y_t &= \delta + \beta x_t + \epsilon_t \\ \epsilon_t &= \rho \epsilon_{t-1} + \eta_t \end{aligned}\]

and the AutoReg version is

\[y_t = \phi + \rho y_{t-1} + \beta x_t + \eta_t\]

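In the AutoReg model, \(y_t\) depends on the entire history of \(x_t\), because each \(x_t\) enters \(y_t\) and is then carried forward through the lagged term \(\rho y_{t-1}\). In the SARIMAX model, the effect of \(x\) is purely contemporaneous: \(y_t\) depends only on the current \(x_t\) and not on any of its lags.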
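To see the difference concretely, expand the AutoReg recursion: for \(|\rho| < 1\), \(y_t = \frac{\phi}{1-\rho} + \beta \sum_{k \ge 0} \rho^k x_{t-k} + \sum_{k \ge 0} \rho^k \eta_{t-k}\), so past values of \(x\) enter with geometrically decaying weights. The sketch below is a noise-free toy simulation in plain Python, not a statsmodels fit. It feeds a single unit pulse in \(x\) through the three equations above. The parameter values are illustrative assumptions, and the SARIMAX recursion is taken with its constant inside the error process, \(u_t = \delta + \rho u_{t-1}\) with \(u_t = y_t - \beta x_t\).

```python
n, rho, beta = 8, 0.5, 1.0
mean = 2.0                      # constant in the ARIMA form (mean level)
intercept = mean * (1 - rho)    # matching constant in the SARIMAX and AutoReg forms

x = [0.0] * n
x[2] = 1.0                      # single unit pulse in the regressor at t = 2

# ARIMA form without noise: eps_t = 0, so y_t = mean + beta * x_t
y_arima = [mean + beta * x[t] for t in range(n)]

# SARIMAX form without noise: u_t = y_t - beta * x_t, u_t = intercept + rho * u_{t-1}
u = [mean]                      # start the error process at its mean
for t in range(1, n):
    u.append(intercept + rho * u[t - 1])
y_sarimax = [beta * x[t] + u[t] for t in range(n)]

# AutoReg form without noise: y_t = intercept + rho * y_{t-1} + beta * x_t
y_autoreg = [mean + beta * x[0]]
for t in range(1, n):
    y_autoreg.append(intercept + rho * y_autoreg[t - 1] + beta * x[t])

print("ARIMA form:  ", y_arima)
print("SARIMAX form:", y_sarimax)
print("AutoReg form:", y_autoreg)
```

With these values the ARIMA and SARIMAX paths coincide: both jump from 2.0 to 3.0 at the pulse and return to 2.0 immediately. The AutoReg path decays back geometrically (3.0, 2.5, 2.25, ...), which is the dependence on the history of \(x_t\) described above.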