Abstract

State-of-the-art multivariate forecasting methods are restricted to low-dimensional tasks, linear dependencies and short horizons. Technological advances (notably the big data revolution) are instead shifting the focus to problems characterized by a large number of variables, non-linear dependencies and long forecasting horizons. In the last few years, most of the best-performing techniques for multivariate forecasting have been based on deep-learning models. However, such models require large amounts of data and computational resources and suffer from a lack of interpretability. To cope with these limitations, we propose an extension to the DFML framework, a hybrid forecasting technique inspired by the Dynamic Factor Model (DFM) approach, a successful forecasting methodology in econometrics. This extension improves the capabilities of the DFM approach by implementing and assessing both linear and non-linear factor estimation techniques as well as model-driven and data-driven factor forecasting techniques. We assess several method integrations within the DFML, and we show that the proposed technique provides competitive results in terms of both forecasting accuracy and computational efficiency on multiple very large-scale (> 10^2 variables and > 10^3 samples) real forecasting tasks.
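To make the factor-based idea concrete, here is a minimal sketch of a DFM-style forecaster. It is not the DFML implementation described in the paper. It assumes one particular linear choice for each stage: PCA (via SVD) for factor estimation and an independent AR(1) model per factor for factor forecasting. The function name `dfm_forecast` and its parameters are hypothetical.

```python
import numpy as np

def dfm_forecast(Y, n_factors, horizon):
    """Illustrative DFM-style forecast (assumed PCA + AR(1), not the paper's code).

    Y: array of shape (T, n), T samples of n series.
    Returns an array of shape (horizon, n).
    """
    mean = Y.mean(axis=0)
    Yc = Y - mean
    # Factor estimation (linear): principal directions from the SVD of centered data.
    _, _, Vt = np.linalg.svd(Yc, full_matrices=False)
    W = Vt[:n_factors].T                 # (n, k) loadings
    Z = Yc @ W                           # (T, k) factor time series
    # Factor forecasting (model-driven): one AR(1) coefficient per factor,
    # fitted by least squares without intercept (factors are zero-mean).
    phi = (Z[:-1] * Z[1:]).sum(axis=0) / (Z[:-1] ** 2).sum(axis=0)
    z = Z[-1]
    steps = []
    for _ in range(horizon):
        z = phi * z                      # iterate the AR(1) recursion
        steps.append(z)
    Z_hat = np.array(steps)              # (horizon, k)
    # Map the forecast factors back to the original n-dimensional space.
    return Z_hat @ W.T + mean
```

Only `n_factors` small models are fitted regardless of the number of series, which is where the computational advantage of factor approaches comes from. A non-linear factor estimator or a machine-learning factor forecaster could be substituted at the corresponding stage.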

Highlights

  • The pervasiveness of interconnected devices (IoT) and the consequent big data revolution are shifting the focus of forecasting to problems characterized by very large dimensionality (n > 100, ..., 1,000), non-linear cross-series dependencies and long forecasting horizons

  • We present the results in two formats: 1) a critical difference (CD) plot highlighting the statistical significance over all horizons and 2) a tabular format, containing the NNMSE values for different horizons and grouping the methods according to three categories: DF-Stat denoting Dynamic Factor Model (DFM) approaches with statistical forecasting, DF-ML denoting DFM approaches with machine learning forecasting and UNI-Stat denoting univariate statistical baselines

  • Taking all horizons into consideration, DFML is significantly better than UNI-Stat; for large horizons, the accuracy of UNI-Stat and DFML techniques tends to converge

Introduction

The pervasiveness of interconnected devices (IoT) and the consequent big data revolution are shifting the focus of forecasting to problems characterized by very large dimensionality (n > 100, ..., 1,000), non-linear cross-series dependencies and long forecasting horizons. Most multivariate forecasting methods in the literature are restricted to low-dimensional (n < 10) vector time series, linear forecasting techniques and short horizons. The most common approaches to multivariate forecasting are model-driven and data-driven (Januschowski et al., 2020). Model-driven approaches include vector regressions (VAR, VARMA, VARIMA, VARMAX) (Lütkepohl, 2005) and kernel-based regression (Exterkate et al., 2016). Vector AutoRegressive (VAR) models have shown a good capability in capturing linear dependencies in applied domains (e.g. wind farms) (Cavalcante et al., 2017). The main drawback of VAR-based models is that the number of parameters grows with the number of lags and the dimension of the task.
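The growth in VAR parameters can be illustrated with a short count. This sketch assumes a standard unrestricted VAR(p) with one n × n coefficient matrix per lag plus an optional intercept vector. It ignores the noise covariance, and the function name `var_param_count` is hypothetical.

```python
def var_param_count(n_series: int, n_lags: int, intercept: bool = True) -> int:
    """Regression coefficients in an unrestricted VAR(p) on n series."""
    # One n x n coefficient matrix per lag.
    coeffs = n_lags * n_series * n_series
    # Optional intercept: one constant per series.
    return coeffs + (n_series if intercept else 0)

for n in (10, 100, 1000):
    print(n, var_param_count(n, n_lags=5))
# prints:
# 10 510
# 100 50100
# 1000 5001000
```

The count is quadratic in the number of series and linear in the number of lags, so with 1,000 series and 5 lags there are already about five million coefficients to estimate.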
