Background on time series modeling and forecasting (2/3)

Time series modeling studies past observations, expressed as a time series, in order to develop a model that describes the inherent structure of the series. One of the most frequently used stochastic time series models is the Autoregressive Integrated Moving Average (ARIMA) model, whose popularity is mainly due to its flexibility: it can represent several varieties of time series with relative simplicity.

Autoregressive Integrated Moving Average (ARIMA) process for univariate time series

ARIMA [4] is a class of generalized models that captures the temporal structure in time series data. For this purpose, ARIMA combines an autoregressive (AR) process and a moving average (MA) process to build a composite model of the time series. In particular, ARIMA forecasts the next values using an autoregression whose parameters are fitted to the data, and then applies a moving average with a second set of parameters. In the autoregression, the variable of interest 𝑦𝑡 is forecasted using a linear combination of its past values 𝑦𝑡-1, 𝑦𝑡-2, ..., 𝑦𝑡-p. The autoregressive term is written as:

𝑦𝑡 = c + α1𝑦𝑡-1 + α2𝑦𝑡-2 + ... + αp𝑦𝑡-p + ε𝑡

where c is a constant, αi (i = 1, 2, ..., p) are the model parameters to be estimated, 𝑦𝑡-i (i = 1, 2, ..., p) are the lagged values of 𝑦𝑡, and ε𝑡 is white noise.
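To make the recursion concrete, here is a minimal simulation sketch: an AR(2) process generated directly from the equation above (the coefficients, series length and random seed are arbitrary illustrative choices):

```python
import numpy as np

# Simulate an AR(2) process: y_t = c + α1*y_{t-1} + α2*y_{t-2} + ε_t
# (coefficients, length and seed are arbitrary illustrative choices)
rng = np.random.default_rng(0)
c, a1, a2, n = 0.5, 0.6, -0.3, 500
eps = rng.normal(0, 1, n)            # white noise ε_t
y = np.zeros(n)
for t in range(2, n):
    y[t] = c + a1 * y[t - 1] + a2 * y[t - 2] + eps[t]
```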

In the moving average term, 𝑦𝑡 is expressed in terms of past forecast errors rather than past values:

𝑦𝑡 = u + ϴ1ε𝑡-1 + ϴ2ε𝑡-2 + ... + ϴqε𝑡-q + ε𝑡

where u is a constant, ϴi (i = 1, 2, ..., q) are the model parameters, ε𝑡-i (i = 1, 2, ..., q) are the random shocks at time periods t-i, and ε𝑡 is white noise.
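A similar sketch for an MA(2) process, built purely from white-noise shocks (again with arbitrary illustrative parameters):

```python
import numpy as np

# Simulate an MA(2) process: y_t = u + ϴ1*ε_{t-1} + ϴ2*ε_{t-2} + ε_t
# (parameters are arbitrary illustrative choices)
rng = np.random.default_rng(1)
u, th1, th2, n = 0.0, 0.7, 0.2, 500
eps = rng.normal(0, 1, n + 2)        # white-noise shocks, padded for the lags
y = u + eps[2:] + th1 * eps[1:-1] + th2 * eps[:-2]
```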

Overall, the autoregressive (AR), integration (I) and moving average (MA) components are effectively combined to form the ARIMA class of time series models, expressed as follows (with 𝑦’𝑡 representing the differenced time series):

𝑦’𝑡 = c + α1𝑦’𝑡-1 + α2𝑦’𝑡-2 + ... + αp𝑦’𝑡-p + ϴ1ε𝑡-1 + ϴ2ε𝑡-2 + ... + ϴqε𝑡-q + ε𝑡
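As an illustration of how the pieces fit together, the following sketch generates a stationary ARMA series with statsmodels' ArmaProcess and then integrates it by cumulative summation, yielding a non-stationary series that first differencing brings back to the ARMA part (the orders and coefficients are arbitrary; note that ArmaProcess expects lag-polynomial coefficients, so the AR signs are negated):

```python
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess

# Stationary ARMA(2,1) sample; ArmaProcess takes lag-polynomial coefficients,
# so the AR part 1 - α1*L - α2*L^2 is written with negated signs.
ar = np.array([1, -0.6, 0.3])        # α1 = 0.6, α2 = -0.3
ma = np.array([1, 0.4])              # ϴ1 = 0.4
arma = ArmaProcess(ar, ma).generate_sample(nsample=500)

y = np.cumsum(arma)                  # integrate once: non-stationary series
y_diff = np.diff(y)                  # first difference recovers the ARMA part
```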

An important prerequisite is to check whether the time series is stationary (constant mean and variance), both by plotting it and by applying a unit root test such as the augmented Dickey-Fuller [1] or Phillips-Perron [2] test. If the time series is not stationary, it can be made stationary by differencing [3].
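A minimal sketch of this check with statsmodels' adfuller, assuming the series is held in a NumPy array y (the 0.05 significance level and the cap of two differences are common but arbitrary choices):

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

def make_stationary(y, alpha=0.05, max_diff=2):
    """Difference y until the augmented Dickey-Fuller test rejects a unit root."""
    for d in range(max_diff + 1):
        pvalue = adfuller(y)[1]      # adfuller returns (statistic, p-value, ...)
        if pvalue < alpha:
            return y, d              # stationary after d differences
        y = np.diff(y)
    return y, max_diff               # give up after max_diff differences
```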

The best parameters are found using the Box-Jenkins method [4], a three-step approach that consists of:

  • identifying the model: ensure that the variables are stationary and select candidate orders based on the Autocorrelation Function (ACF) [5] for the MA terms and the Partial Autocorrelation Function (PACF) [5] for the AR terms;
  • estimating the parameters (α and ϴ) that best fit the ARIMA model, e.g. by maximum likelihood [6] or nonlinear least squares [7]; among the candidate models, the best-suited one is the model with the lowest AIC or BIC value [8];
  • checking the model statistically: verify that the residuals are white noise with constant mean and variance over time; if these assumptions are not satisfied, a more appropriate model must be fitted.

If all the assumptions are satisfied, future values can be forecasted with the model, as sketched below. The ARIMA model has been generalized by Box and Jenkins to deal with seasonality.
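The following sketch walks through the three steps with statsmodels, assuming a series y as above; the candidate (p, d, q) orders are illustrative placeholders that would normally be read off the ACF/PACF plots:

```python
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.stats.diagnostic import acorr_ljungbox

# 1. Identification: read candidate q off the ACF and candidate p off the PACF.
plot_acf(y)
plot_pacf(y)
plt.show()

# 2. Estimation: fit the candidate orders and keep the lowest-AIC model.
candidates = [(1, 1, 0), (0, 1, 1), (1, 1, 1)]    # illustrative (p, d, q) set
fits = [ARIMA(y, order=order).fit() for order in candidates]
best = min(fits, key=lambda f: f.aic)

# 3. Checking: the residuals should behave like white noise (Ljung-Box test).
print(acorr_ljungbox(best.resid, lags=[10]))

# If the checks pass, forecast future values with the selected model.
print(best.forecast(steps=10))
```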

Seasonal Autoregressive Integrated Moving Average (SARIMA) process for univariate time series

Seasonal ARIMA (SARIMA) [9] deals with a seasonal component in univariate time series. In addition to the autoregression (AR), differencing (I) and moving average (MA) terms, SARIMA accounts for the seasonal component of the time series through additional parameters tied to the period of the seasonality. The SARIMA model is hence written SARIMA(p,d,q)(P,D,Q)m, where P is the order of the seasonal AR term, D the order of the seasonal integration term, Q the order of the seasonal MA term and m the seasonal period.
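A short sketch with statsmodels' SARIMAX, assuming a monthly series y with a yearly cycle (all orders and the period m = 12 are illustrative assumptions):

```python
from statsmodels.tsa.statespace.sarimax import SARIMAX

# SARIMA(1,1,1)(1,1,1)12: a 12-step seasonal period, e.g. monthly data with a
# yearly cycle (all orders chosen purely for illustration).
model = SARIMAX(y, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
result = model.fit(disp=False)
print(result.summary())
print(result.forecast(steps=12))     # forecast one full season ahead
```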

Vector Autoregressive Moving Average (VARMA) process for multivariate time series

Contrary to the ARIMA model, which is fitted to a univariate time series, VARMA(p,q) [10] deals with multiple time series that may influence each other. Each variable is regressed on p lags of itself and of all the other variables, and likewise on q lagged error terms. Given k time series 𝑦1,𝑡, 𝑦2,𝑡, ..., 𝑦k,𝑡 expressed as a vector V𝑡 = [𝑦1,𝑡, 𝑦2,𝑡, ..., 𝑦k,𝑡], the VARMA(p,q) model combines the VAR and MA components:

V𝑡 = c + α1V𝑡-1 + α2V𝑡-2 + ... + αpV𝑡-p + ϴ1ε𝑡-1 + ϴ2ε𝑡-2 + ... + ϴqε𝑡-q + ε𝑡

Equation 1. VARMA in matrix notation

where c is a k×1 vector of constants, αi (i = 1, 2, ..., p) and ϴi (i = 1, 2, ..., q) are k×k matrices of model parameters that capture the cross-variable dependencies, k is the number of time series, V𝑡-i (i = 1, 2, ..., p) are the vectors of lagged values, ε𝑡-i (i = 1, 2, ..., q) are the vectors of random shocks, and ε𝑡 is a white noise vector with zero mean and constant covariance matrix.
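A minimal VARMA(1,1) sketch with statsmodels' VARMAX on two synthetic, differenced series (the data and orders are purely illustrative):

```python
import numpy as np
from statsmodels.tsa.statespace.varmax import VARMAX

# Two coupled synthetic series stacked as columns of an (n, k) array.
rng = np.random.default_rng(2)
data = rng.normal(size=(200, 2)).cumsum(axis=0)
data = np.diff(data, axis=0)         # difference each series to stationarity

# VARMA(1,1): each variable is regressed on one lag of every variable and on
# one lagged shock of every variable.
result = VARMAX(data, order=(1, 1)).fit(disp=False)
print(result.forecast(steps=5))      # joint forecast for both series
```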

In the following, we will use this family of models to model and predict the behavior of the VNF/CNF system and detect anomalies.

If you missed the first part, you can read the introduction here:

https://theexpert.squad.fr/theexpert/digital/devops/introduction-to-machine-learning-1-3/

References ⤵

  • [1] Dickey, D. A., & Fuller, W. A. (1979). Distribution of the estimators for autoregressive time series with a unit root. Journal of the American Statistical Association, 74(366a), 427-431.
  • [2] Phillips, P. C. B., & Perron, P. (1988). Testing for a unit root in time series regression. Biometrika, 75(2), 335-346.
  • [3] Nason, G. P. (2006). Stationary and non-stationary time series. Statistics in Volcanology, 60.
  • [4] Box, G. E., Jenkins, G. M., Reinsel, G. C., & Ljung, G. M. (2015). Time series analysis: forecasting and control. John Wiley & Sons.
  • [5] Watson, P. K., & Teelucksingh, S. S. (2002). A practical introduction to econometric methods: Classical and modern. University of the West Indies Press.
  • [6] Myung, I. J. (2003). Tutorial on maximum likelihood estimation. Journal of Mathematical Psychology, 47(1), 90-100.
  • [7] Hartley, H. O., & Booker, A. (1965). Nonlinear least squares estimation. Annals of Mathematical Statistics, 36(2), 638-650.
  • [8] Akaike, H. (1998). Information theory and an extension of the maximum likelihood principle. In Selected Papers of Hirotugu Akaike (pp. 199-213). Springer, New York, NY.
  • [9] Hyndman, R. J., & Athanasopoulos, G. (2018). Forecasting: principles and practice. OTexts.
  • [10] Brockwell, P. J., & Davis, R. A. (2016). Introduction to time series and forecasting. Springer.