Abstract

In this work, a cross-validation procedure is used to identify an appropriate Autoregressive Integrated Moving Average (ARIMA) model and an appropriate state space model for a time series. A minimum size for the training set is specified. The procedure is based on one-step forecasts and uses a sequence of training sets, each containing one more observation than the previous one. All possible state space models, and all ARIMA models whose orders are allowed to range within reasonable limits, are fitted to both raw and log-transformed data with regular differencing (up to second-order differences) and, if the time series is seasonal, seasonal differencing (up to first-order differences). The root mean squared error of each model is calculated from the one-step forecast errors obtained. Among all the ARIMA and state space models considered, the model selected is the one that has the lowest root mean squared error and passes the Ljung–Box test, applied to all of the available data, at a reasonable significance level. The procedure is exemplified in this paper with a case study of retail sales of different categories of women’s footwear from a Portuguese retailer, and its accuracy is compared with three reliable forecasting approaches. The results show that our procedure consistently forecasts more accurately than the other approaches and that the improvements in accuracy are significant.
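The expanding-window evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two forecasters below (naïve and seasonal naïve) are simple stand-ins for fitted ARIMA and state space models, the Ljung–Box check and the log and differencing transformations are omitted, and all names (`rolling_origin_rmse`, `select_model`, `min_train`) are hypothetical.

```python
import math


def naive(train):
    """One-step forecast: repeat the last observed value."""
    return train[-1]


def make_seasonal_naive(period):
    """One-step forecast: repeat the value observed one season earlier."""
    def forecast(train):
        return train[-period]
    return forecast


def rolling_origin_rmse(series, forecaster, min_train):
    """RMSE of one-step forecasts over expanding training sets.

    The first training set holds `min_train` observations; each later one
    holds one more observation than the previous one.
    """
    if not 0 < min_train < len(series):
        raise ValueError("min_train must be between 1 and len(series) - 1")
    squared_errors = []
    for end in range(min_train, len(series)):
        train = series[:end]            # observations 0 .. end-1
        prediction = forecaster(train)  # forecast for observation `end`
        squared_errors.append((series[end] - prediction) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def select_model(series, candidates, min_train):
    """Return the candidate name with the lowest RMSE, plus all scores."""
    scores = {
        name: rolling_origin_rmse(series, forecaster, min_train)
        for name, forecaster in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Toy series with a seasonal period of 4 (illustrative data only).
series = [10, 20, 30, 40] * 5
candidates = {"naive": naive, "seasonal_naive": make_seasonal_naive(4)}
best, scores = select_model(series, candidates, min_train=8)
print(best, scores)
```

In a fuller version, each candidate would refit a model on every training set, and the final choice would be restricted to models whose residuals pass a diagnostic test on the complete series.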

Highlights

  • Time series often exhibit strong trends and seasonal variations, presenting challenges in developing effective forecasting models

  • To assess the models selected by the cross-validation procedure developed, we also evaluated the forecasts from three other forecasting approaches: the Hyndman and Khandakar [23] algorithm, which identifies and estimates Autoregressive Integrated Moving Average (ARIMA) models; the Hyndman and Athanasopoulos [16] statistical framework, which identifies and estimates state space models; and the seasonal naïve method, which was used as a benchmark because simple forecasting methods are sometimes surprisingly effective

  • The mean absolute percentage error (MAPE) consistently ranks the approaches differently from the other error measures, which reinforces the impact that its limitations can have on the results
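As a small illustration of how MAPE can rank forecasts differently from a scale-dependent measure such as RMSE, the sketch below uses invented numbers, not data from the paper. Because MAPE divides each error by the actual value, an error on a small observation weighs far more than the same error on a large one.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (undefined if an actual is 0)."""
    n = len(actual)
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / n


# Invented values: one very small and one large observation.
actual = [1.0, 100.0]
forecast_a = [2.0, 100.0]   # misses the small value by 1 unit
forecast_b = [1.0, 110.0]   # misses the large value by 10 units

print("A:", rmse(actual, forecast_a), mape(actual, forecast_a))
print("B:", rmse(actual, forecast_b), mape(actual, forecast_b))
```

Here RMSE favours forecast A while MAPE favours forecast B, so the two measures order the same pair of forecasts in opposite ways.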


Summary

Introduction

Time series often exhibit strong trends and seasonal variations, presenting challenges in developing effective forecasting models. In the cross-validation procedure, the training set is used for estimating the model and the test set is used to measure how well the model is likely to forecast on new data. This approach is exemplified in the paper with a case study of retail sales time series of different categories of women’s footwear from a Portuguese retailer that, by exhibiting complex patterns, present challenges in developing effective forecasting models. Aggregate time series are usually preferred because they contain both trends and seasonal patterns, providing a good testing ground for developing forecasting methods, and because companies can benefit from more accurate forecasts. We present a brief description of the state space models and the ARIMA models and introduce the usual forecast error measures.

State Space Models
ARIMA Models
Forecast Error Measures
Model Identification
Empirical Study
Results
Conclusions
