Abstract
Deep neural networks and gradient boosted tree models have swept across the field of machine learning over the past decade, producing across-the-board advances in performance. The ability of these methods to capture feature interactions and nonlinearities makes them exceptionally powerful and, at the same time, prone to overfitting, leakage, and poor generalization in domains with target non-stationarity and collinearity, such as time-series forecasting. We offer guidance to address these difficulties and provide a framework that maximizes the chances of producing predictions that generalize well and deliver state-of-the-art performance. The techniques we describe for cross-validation, augmentation, and parameter tuning have been used to win several major time-series forecasting competitions—including the M5 Forecasting Uncertainty competition and the Kaggle COVID19 Forecasting series—and, with the proper theoretical grounding, constitute the current best practices in time-series forecasting.
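As a minimal illustration of the leakage-safe cross-validation concern the abstract raises, the sketch below uses scikit-learn's `TimeSeriesSplit` so that each fold trains only on observations that precede its validation window. The estimator and synthetic data are placeholders chosen for illustration, not the paper's actual protocol.

```python
# Sketch of forward-chaining cross-validation for time series, assuming
# scikit-learn; the model and data below are hypothetical placeholders.
import numpy as np
from sklearn.model_selection import TimeSeriesSplit
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))        # hypothetical time-ordered feature matrix
y = np.cumsum(rng.normal(size=500))  # hypothetical non-stationary target

# Each fold fits on past data only, so no future information leaks
# into the model being validated.
tscv = TimeSeriesSplit(n_splits=5)
for fold, (train_idx, valid_idx) in enumerate(tscv.split(X)):
    model = GradientBoostingRegressor().fit(X[train_idx], y[train_idx])
    mae = mean_absolute_error(y[valid_idx], model.predict(X[valid_idx]))
    print(f"fold {fold}: train={len(train_idx)}, valid={len(valid_idx)}, MAE={mae:.3f}")
```

Unlike shuffled k-fold splits, this forward-chaining scheme respects temporal order, which is what prevents the leakage and overoptimistic validation scores the abstract warns about.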