In this paper, we forecast the dynamics of unemployment in Russia using several machine learning methods: random forest, gradient boosting, elastic net, and neural networks. The scientific contribution of this paper is threefold. First, along with feed-forward, fully connected neural networks, we use sequence-to-sequence recurrent neural networks, which take the time-series structure of the data into account. Second, in addition to univariate long short-term memory models, we estimate multivariate recurrent neural networks that include additional macroeconomic indicators. Third, the model evaluation process accounts for revisions of statistical information by using real-time datasets. To improve predictive performance, we also use unstructured indicators: search queries and news indices. Relative to the structural model of unemployment dynamics, the mean absolute forecast error one month ahead falls by 65%, to 0.12 percentage points of the unemployment rate, for the recurrent neural network and long short-term memory models, and by 56%, to 0.14 percentage points, for the modified gradient boosting algorithms. When revisions of statistical information are accounted for, the proposed models reduce the root-mean-square error further, which highlights the importance of accounting for possible changes in how the values of macroeconomic indicators are calculated.
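The headline comparison rests on two simple quantities: the mean absolute forecast error of each model and its relative reduction against a benchmark. The sketch below shows one way these could be computed. The function names and all the numbers in it are hypothetical and purely illustrative; they are not the paper's data, code, or results.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Average absolute gap between realized and forecast values (percentage points)."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and of equal length")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def relative_reduction(benchmark_error: float, model_error: float) -> float:
    """Share by which the model error is lower than the benchmark error."""
    if benchmark_error <= 0:
        raise ValueError("benchmark_error must be positive")
    return 1.0 - model_error / benchmark_error


# Hypothetical one-month-ahead unemployment rates, in percent (not from the paper).
actual = [4.6, 4.5, 4.4, 4.3]
benchmark_forecast = [5.0, 4.8, 4.1, 4.6]  # stand-in for a structural benchmark
model_forecast = [4.7, 4.4, 4.5, 4.2]      # stand-in for a machine learning model

benchmark_mae = mean_absolute_error(actual, benchmark_forecast)
model_mae = mean_absolute_error(actual, model_forecast)
reduction = relative_reduction(benchmark_mae, model_mae)

print(f"benchmark MAE: {benchmark_mae:.3f} pp")
print(f"model MAE:     {model_mae:.3f} pp")
print(f"reduction:     {reduction:.1%}")
```

In a real-time evaluation, the `actual` series would presumably be replaced by the data vintage available at each forecast date, so that the errors reflect later revisions to the statistics.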