# ssarima() - State-Space ARIMA

#### 2017-02-19

SSARIMA stands for “State-Space ARIMA” or “Several Seasonalities ARIMA”. Both names describe what happens at the heart of the function: it constructs ARIMA in a state-space form and allows modelling several (in fact, more than several) seasonalities. ssarima() is part of the smooth package. This vignette covers the ssarima() and auto.ssarima() functions.

As usual, we will use data from the Mcomp package, so it is advised to install it.

require(smooth)
require(Mcomp)

The default call constructs ARIMA(0,1,1):

ssarima(M3$N2457$x, h=18)
## Time elapsed: 0.03 seconds
## Model estimated: ARIMA(0,1,1)
## Matrix of MA terms:
##        Lag 1
## MA(1) -0.794
## Initial values were produced using backcasting.
## 2 parameters were estimated in the process
## Residuals standard deviation: 2116.361
## Cost function type: MSE; Cost function value: 4401090
##
## Information criteria:
##      AIC     AICc      BIC
## 2089.553 2089.660 2095.042
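
To make the state-space idea concrete, here is a minimal base-R sketch. It is not the package's internal code. It assumes the MA polynomial is written as (1 + θB) and uses a single-state form with the measurement equation y_t = v_{t-1} + e_t and the transition equation v_t = v_{t-1} + (1 + θ)e_t. The simulated errors, the seed and the initial state are arbitrary choices made for illustration; only the MA(1) value is taken from the output above.

```r
set.seed(41)                      # arbitrary seed, for reproducibility only
n <- 200                          # illustrative sample size
theta <- -0.794                   # MA(1) estimate reported above
e <- rnorm(n)                     # illustrative error term

v <- numeric(n + 1)               # v[t] holds the state at time t - 1
y <- numeric(n)
v[1] <- 10                        # arbitrary initial state
for (t in 1:n) {
  y[t] <- v[t] + e[t]                     # measurement equation
  v[t + 1] <- v[t] + (1 + theta) * e[t]   # transition equation
}

# ARIMA(0,1,1) recursion: (1 - B) y_t = e_t + theta * e_{t-1}
lhs <- diff(y)
rhs <- e[-1] + theta * e[-n]
stopifnot(isTRUE(all.equal(lhs, rhs)))
```

The final check passes because the first difference of the simulated series equals e_t + θe_{t-1}, which is the ARIMA(0,1,1) recursion under the assumed sign convention.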

A more complicated model can be defined via the orders and lags parameters in the following way:

ssarima(M3$N2457$x, orders=list(ar=c(0,1), i=c(1,0), ma=c(1,1)), lags=c(1,12), h=18)
## Time elapsed: 0.07 seconds
## Model estimated: SARIMA(0,1,1)[1](1,0,1)[12]
## Matrix of AR terms:
##       Lag 12
## AR(1)  0.786
## Matrix of MA terms:
##        Lag 1 Lag 12
## MA(1) -0.815 -0.319
## Initial values were produced using backcasting.
## 4 parameters were estimated in the process
## Residuals standard deviation: 1935.435
## Cost function type: MSE; Cost function value: 3615618
##
## Information criteria:
##      AIC     AICc      BIC
## 2070.945 2071.308 2081.925

This constructs a seasonal ARIMA(0,1,1)(1,0,1)_12: each element of ar, i and ma corresponds to the lag in the same position of lags.
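
Written with the backshift operator $B$, and assuming the convention in which the AR polynomials carry minus signs and the MA polynomials carry plus signs (the vignette does not state which convention the function uses), this model reads:

$$
(1 - \Phi_1 B^{12})(1 - B)\, y_t = (1 + \theta_1 B)(1 + \Theta_1 B^{12})\, \varepsilon_t
$$

Here $\Phi_1$ is the AR(1) term at lag 12, while $\theta_1$ and $\Theta_1$ are the MA(1) terms at lags 1 and 12. Under that assumption the estimates printed above correspond to $\Phi_1 = 0.786$, $\theta_1 = -0.815$ and $\Theta_1 = -0.319$.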

We could try selecting orders manually, but this can also be done automatically via the auto.ssarima() function:

auto.ssarima(M3$N2457$x, h=18)
## Estimation progress:     100%... Done!
## Time elapsed: 4.47 seconds
## Model estimated: SARIMA(0,1,2)[1](0,0,2)[12] with drift
## Matrix of MA terms:
##        Lag 1 Lag 12
## MA(1) -0.598  0.436
## MA(2) -0.320  0.534
## Constant value is: 56.093
## Initial values were produced using backcasting.
## 6 parameters were estimated in the process
## Residuals standard deviation: 1806.293
## Cost function type: MSE; Cost function value: 3092468
##
## Information criteria:
##      AIC     AICc      BIC
## 2056.971 2057.749 2073.441
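
The three sets of information criteria printed so far can be compared directly. The base-R sketch below recomputes AICc and BIC from the reported AIC values and parameter counts using the textbook formulas. The sample size n = 115 is an assumption inferred from the printed numbers, not a value stated in the vignette.

```r
n <- 115   # assumption: sample size inferred from the reported criteria

models <- data.frame(
  model = c("ARIMA(0,1,1)",
            "SARIMA(0,1,1)[1](1,0,1)[12]",
            "SARIMA(0,1,2)[1](0,0,2)[12] with drift"),
  k     = c(2, 4, 6),                       # parameters reported in each output
  AIC   = c(2089.553, 2070.945, 2056.971),  # AIC reported in each output
  stringsAsFactors = FALSE
)

# Textbook small-sample correction and BIC expressed through AIC
models$AICc <- models$AIC + 2 * models$k * (models$k + 1) / (n - models$k - 1)
models$BIC  <- models$AIC + models$k * (log(n) - 2)

print(round(models[, c("k", "AIC", "AICc", "BIC")], 3))

# Model with the lowest AICc
best <- models$model[which.min(models$AICc)]
print(best)
```

Under these assumptions the recomputed values agree with the printed ones up to rounding, and the automatically selected model has the lowest AICc of the three.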

Automatic order selection in SSARIMA with optimised initials does not work well and is generally not recommended. This is partly because some models can have a large number of parameters, and partly because the first observations may be overfitted when a non-zero AR order is selected. The problem can be seen on the example of another time series, which has a complicated seasonality:

auto.ssarima(M3$N1683$x, h=18, initial="backcasting")
## Estimation progress:     100%... Done!
## Time elapsed: 2.9 seconds
## Model estimated: SARIMA(0,0,3)[1](3,0,0)[12] with constant
## Matrix of AR terms:
##       Lag 12
## AR(1)  0.162
## AR(2)  0.332
## AR(3)  0.173
## Matrix of MA terms:
##       Lag 1
## MA(1) 0.221
## MA(2) 0.173
## MA(3) 0.235
## Constant value is: 1311.342
## Initial values were produced using backcasting.
## 8 parameters were estimated in the process
## Residuals standard deviation: 390.997
## Cost function type: MSE; Cost function value: 141554
##
## Information criteria:
##      AIC     AICc      BIC
## 1603.418 1604.873 1624.875

auto.ssarima(M3$N1683$x, h=18, initial="optimal")
## Estimation progress:     100%... Done!
## Time elapsed: 4.73 seconds
## Model estimated: ARIMA(0,0,3) with constant
## Matrix of MA terms:
##       Lag 1
## MA(1) 0.356
## MA(2) 0.293
## MA(3) 0.310
## Constant value is: 3786.814
## Initial values were optimised.
## 8 parameters were estimated in the process
## Residuals standard deviation: 427.661
## Cost function type: MSE; Cost function value: 169346
##
## Information criteria:
##      AIC     AICc      BIC
## 1622.778 1624.233 1644.235

As the second output shows, ssarima with optimal initials does not select a seasonal model and reverts to ARIMA(0,0,3) with constant. In theory this could be due to the implemented order selection algorithm; however, if we estimate all the models in the pool separately, we will see that this model is optimal for this time series when this type of initials is used.
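
The remark about the number of parameters can be illustrated with a small base-R sketch. The counting rule is an assumption inferred from the counts printed in this vignette, not taken from the package source: AR and MA coefficients, plus the constant, plus one for the error variance, plus one per state when the initials are optimised, with the number of states taken as max(sum((ar + i) * lags), sum(ma * lags)). countParams is a hypothetical helper, not a function of the smooth package.

```r
# Hypothetical helper: counts estimated parameters under the assumed rule
countParams <- function(ar, i, ma, lags, constant = FALSE, optimalInitials = FALSE) {
  # Assumed state dimension: the longest lag polynomial of the model
  nStates <- max(sum((ar + i) * lags), sum(ma * lags))
  # AR and MA coefficients, the optional constant and the error variance
  nParams <- sum(ar) + sum(ma) + as.integer(constant) + 1
  # Optimised initials add one parameter per state
  if (optimalInitials) nParams <- nParams + nStates
  nParams
}

# The two models selected for N1683 above
backcast <- countParams(ar = c(0, 3), i = c(0, 0), ma = c(3, 0),
                        lags = c(1, 12), constant = TRUE)
optimal  <- countParams(ar = 0, i = 0, ma = 3, lags = 1,
                        constant = TRUE, optimalInitials = TRUE)

# The seasonal model, had its initials been optimised instead of backcast
seasonalOptimal <- countParams(ar = c(0, 3), i = c(0, 0), ma = c(3, 0),
                               lags = c(1, 12), constant = TRUE,
                               optimalInitials = TRUE)

print(c(backcast = backcast, optimal = optimal, seasonalOptimal = seasonalOptimal))
```

Under this rule both selected models have 8 parameters, which matches the outputs above, while the seasonal model with optimised initials would require 44.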

Now let’s introduce some artificial exogenous variables:

x <- cbind(rnorm(length(M3$N2457$x), 50, 3), rnorm(length(M3$N2457$x), 100, 7))
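
Because rnorm() draws random numbers, this matrix changes on every run, and so will any estimates that depend on it. A reproducible variant might look like the sketch below. The seed, the column names and nObs (which stands in for length(M3$N2457$x) so that the snippet runs without Mcomp) are all illustrative assumptions.

```r
set.seed(2017)   # arbitrary seed, chosen only for illustration
nObs <- 115      # assumption: stands in for length(M3$N2457$x)

# Two artificial regressors with named columns
x <- cbind(x1 = rnorm(nObs, 50, 3),
           x2 = rnorm(nObs, 100, 7))

print(dim(x))
print(colnames(x))
```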

If we save the model:

ourModel <- auto.ssarima(M3$N2457$x, h=18, holdout=TRUE, xreg=x, updateX=TRUE)
## Estimation progress:     100%... Done!

we can then reuse it:

ssarima(M3$N2457$x, model=ourModel, h=18, holdout=FALSE, xreg=x, updateX=TRUE, intervals=TRUE)
## 100%Done!
## Time elapsed: 0.38 seconds
## Model estimated: SARIMAX(0,0,2)[1](0,0,2)[12] with constant
## Matrix of MA terms:
##       Lag 1 Lag 12
## MA(1) 0.107  0.101
## MA(2) 0.101  0.100
## Constant value is: 3166.51
## Initial values were provided by user.
## 40 parameters were estimated in the process
## Residuals standard deviation: 3056.946
## Xreg coefficients were estimated in a crazy style
## Cost function type: MSE; Cost function value: 6094511
##
## Information criteria:
##      AIC     AICc      BIC
## 2202.989 2247.314 2312.787
## 95% parametric prediction intervals were constructed

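Finally, we can ask for a combination of several SARIMA models via combine=TRUE. Note that the output of this particular call still reports a single ARIMA(0,1,1), the same model as in the default call: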

ssarima(M3$N2457$x, h=18, holdout=FALSE, intervals=TRUE, combine=TRUE)
## Time elapsed: 0.03 seconds
## Model estimated: ARIMA(0,1,1)
## Matrix of MA terms:
##        Lag 1
## MA(1) -0.794
## Initial values were produced using backcasting.
## 2 parameters were estimated in the process
## Residuals standard deviation: 2116.361
## Cost function type: MSE; Cost function value: 4401090
##
## Information criteria:
##      AIC     AICc      BIC
## 2089.553 2089.660 2095.042
## 95% parametric prediction intervals were constructed
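
The vignette does not say how the models are weighted in a combination. One common approach, shown here purely as an illustration and not as the package's method, is to use information-criterion (Akaike) weights. The sketch takes the AICc values of the three models estimated on N2457 earlier in this vignette.

```r
# AICc values reported earlier for the three models fitted to N2457
aicc <- c("ARIMA(0,1,1)"                           = 2089.660,
          "SARIMA(0,1,1)[1](1,0,1)[12]"            = 2071.308,
          "SARIMA(0,1,2)[1](0,0,2)[12] with drift" = 2057.749)

# Akaike weights: exp(-0.5 * difference to the best model), normalised to sum to one
delta   <- aicc - min(aicc)
weights <- exp(-0.5 * delta) / sum(exp(-0.5 * delta))

print(round(weights, 4))
```

A combined forecast would then be the weighted sum of the individual forecasts. With differences in AICc as large as these, almost all of the weight goes to the model with the lowest AICc.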