Abstract
This paper presents a feature construction approach called Statistical Feature Construction (SFC) for time series prediction. Creation of new features is based on statistical characteristics of the analyzed data series. First, the initial data are transformed into an array of short pseudo-stationary windows. For each window, a statistical model is created, and the characteristics of these models are later used as additional features for a single window or as time-dependent features for the entire time series. To demonstrate the effect of SFC, five plasma physics and six oceanographic time series were analyzed. For each window, unknown distribution parameters were estimated with the method of moving separation of finite normal mixtures. The first four statistical moments of these mixtures, for both the initial data and the increments, were used as additional data features. Multi-layer recurrent neural networks were trained to create short- and medium-term forecasts with a single window as input data; the additional features were used to initialize the hidden state of the recurrent layers. A hyperparameter grid search was performed to compare fully optimized neural networks on the original and enriched data. A significant decrease in the RMSE metric was observed, with a median of 11.4%, and RMSE did not increase for any of the analyzed time series. The experimental results show that SFC can be a valuable method for improving forecasting accuracy.
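The feature-construction step described above can be illustrated with a minimal sketch: split a series into short windows and attach the first four statistical moments of each window as extra features. This simplified version uses plain sample moments rather than the paper's moving separation of finite normal mixtures, and the function name is hypothetical:

```python
import numpy as np

def window_moment_features(series, window):
    """Split a 1-D series into non-overlapping windows and, for each
    window, compute the first four sample moments (mean, variance,
    skewness, excess kurtosis) to use as additional features."""
    n = len(series) // window
    w = np.asarray(series[: n * window], dtype=float).reshape(n, window)
    mu = w.mean(axis=1, keepdims=True)
    c = w - mu                          # centered values per window
    m2 = (c ** 2).mean(axis=1)          # 2nd central moment (variance)
    m3 = (c ** 3).mean(axis=1)          # 3rd central moment
    m4 = (c ** 4).mean(axis=1)          # 4th central moment
    skew = m3 / np.power(m2, 1.5)       # standardized skewness
    kurt = m4 / m2 ** 2 - 3.0           # excess kurtosis
    feats = np.column_stack([mu.ravel(), m2, skew, kurt])
    return w, feats                     # windows + per-window features
```

In a setup like the one in the abstract, `w` would serve as the recurrent network's input windows, while each row of `feats` could initialize the hidden state of the recurrent layers for the corresponding window.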
Highlights
Forecasting of real-world processes can be limited by the amount of information that can be reasonably collected
The choice of LSTM recurrent layers provided better results than the use of Gated Recurrent Unit (GRU) and plain RNN layers
The paper presents a statistical approach to data modeling and feature construction with applications for two different sets of data
Summary
Forecasting of real-world processes can be limited by the amount of information that can be reasonably collected. These conditions motivate research into probability mixture models for the distributions of the observed processes [1]. A wide class of distributions of the form H(x) = EP[F(x, y)] is usually chosen as the base family [2,3]. Here EP denotes the mathematical expectation with respect to some probability measure P, which defines the mixing distribution; it is usually determined through analysis of the behavior of external factors. F(x, y) is a distribution function with a random vector of parameters y and is called the kernel distribution.
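For a discrete mixing distribution P, the expectation EP[F(x, y)] reduces to a weighted sum of kernel distribution functions. A minimal sketch for the finite normal mixture case, where the kernel is the normal CDF and y = (mean, standard deviation), might look as follows (function names are illustrative):

```python
import math

def normal_cdf(x, mean, std):
    """Normal distribution function Phi((x - mean) / std),
    computed via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def mixture_cdf(x, weights, means, stds):
    """Finite normal mixture H(x) = sum_k p_k * Phi((x - a_k) / s_k):
    the expectation E_P[F(x, y)] with a discrete mixing measure P
    placing weight p_k on kernel parameters y_k = (a_k, s_k)."""
    return sum(p * normal_cdf(x, a, s)
               for p, a, s in zip(weights, means, stds))
```

For example, a symmetric two-component mixture with weights (0.5, 0.5), means (-1, 1), and unit standard deviations satisfies H(0) = 0.5 by symmetry.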