Abstract

This paper presents a feature construction approach called Statistical Feature Construction (SFC) for time series prediction. New features are created from statistical characteristics of the analyzed data series. First, the initial data are transformed into an array of short pseudo-stationary windows. For each window, a statistical model is created, and the characteristics of these models are later used as additional features for a single window or as time-dependent features for the entire time series. To demonstrate the effect of SFC, five plasma physics and six oceanographic time series were analyzed. For each window, unknown distribution parameters were estimated with the method of moving separation of finite normal mixtures. The first four statistical moments of these mixtures, for both the initial data and their increments, were used as additional data features. Multi-layer recurrent neural networks were trained to create short- and medium-term forecasts with a single window as input data; the additional features were used to initialize the hidden state of the recurrent layers. A hyperparameter grid search was performed to compare fully optimized neural networks on the original and enriched data. A significant decrease in the RMSE metric was observed, with a median of 11.4%, and no increase in RMSE occurred for any of the analyzed time series. The experimental results show that SFC can be a valuable method for improving forecasting accuracy.

Highlights

  • Forecasting of real-world processes can be limited by the amount of information that can be reasonably collected

  • The choice of LSTM recurrent layers provided better results than the use of Gated Recurrent Unit (GRU) or simple RNN layers

  • The paper presents a statistical approach to data modeling and feature construction with applications for two different sets of data


Introduction

Forecasting of real-world processes can be limited by the amount of information that can be reasonably collected. These conditions motivate research into probability mixture models for the distributions of the observed processes [1]. A wide class of distributions of the form H(x) = EP[F(x, y)] is usually chosen as the base family [2,3]. Here, EP denotes the mathematical expectation with respect to some probability measure P, which defines the mixing distribution; it is usually determined through analysis of the behavior of external factors. F(x, y) is a distribution function with a random vector of parameters y and is called the kernel (mixed) distribution.
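For a finite normal mixture, the moments of H(x) follow analytically from the component weights, means, and standard deviations, which is what makes them convenient window-level features. The sketch below (an illustration; the function name and interface are not from the paper) computes the first four moments of such a mixture via the law of total expectation.

```python
# Sketch: first four moments of a finite normal mixture
# H(x) = sum_i w_i * Phi((x - mu_i) / sigma_i),
# computed analytically from the component parameters.

def mixture_moments(weights, means, sigmas):
    """Return (mean, variance, skewness, kurtosis) of a normal mixture."""
    # Mixture mean: weighted average of component means.
    m = sum(w * mu for w, mu in zip(weights, means))
    # Variance: E[X^2] - (E[X])^2, with E[X^2] = sum_i w_i (sigma_i^2 + mu_i^2).
    var = sum(w * (s**2 + mu**2) for w, mu, s in zip(weights, means, sigmas)) - m**2
    # Third and fourth central moments of the mixture:
    mu3 = sum(w * ((mu - m)**3 + 3 * (mu - m) * s**2)
              for w, mu, s in zip(weights, means, sigmas))
    mu4 = sum(w * ((mu - m)**4 + 6 * (mu - m)**2 * s**2 + 3 * s**4)
              for w, mu, s in zip(weights, means, sigmas))
    return m, var, mu3 / var**1.5, mu4 / var**2

# A single standard-normal component recovers mean 0, variance 1,
# skewness 0, and (non-excess) kurtosis 3.
print(mixture_moments([1.0], [0.0], [1.0]))
```

In a pipeline like the one described above, such a function would be applied to the mixture parameters estimated on each pseudo-stationary window, yielding four extra features per window.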
