Abstract

Knowledge discovery systems are now expected to store and process very large volumes of data. With large multivariate time series, prediction becomes increasingly difficult: using all available variables does not yield the most accurate predictions and poses problems for classical prediction models. In this article, we present a scalable process for large time series prediction, including a new algorithm for identifying time series predictors, which analyses the dependencies between time series using the mutual reinforcement principle between hubs and authorities of the HITS (Hyperlink-Induced Topic Search) algorithm. The proposed framework is evaluated on three real datasets. The results show that the best predictions are obtained using a very small number of predictors compared to the initial number of variables. The proposed feature selection algorithm shows promising results compared to widely known algorithms, such as classic and kernel principal component analysis, factor analysis, and the fast correlation-based filter method, and improves the prediction accuracy for many time series in the datasets used.
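To make the hub/authority idea concrete, the following is a minimal sketch of the generic HITS power iteration on a weighted directed graph. It is not the predictor selection algorithm of the article, whose details are not given in this abstract. The function name `hits`, the matrix `weights`, and the reading of `weights[i][j]` as a dependency score from series i to series j are assumptions made for illustration.

```python
import math


def _normalize(vec):
    """Scale vec to unit Euclidean length (left unchanged if all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def hits(weights, max_iter=100, tol=1e-9):
    """Generic HITS iteration; returns (hubs, authorities).

    weights[i][j] is the weight of the edge i -> j. Interpreting it as a
    dependency score between time series is an assumption of this sketch.
    """
    n = len(weights)
    hubs = [1.0] * n
    auths = [1.0] * n
    for _ in range(max_iter):
        # Authority of j: weighted sum of the hub scores of nodes pointing to j.
        new_auths = _normalize(
            [sum(weights[i][j] * hubs[i] for i in range(n)) for j in range(n)]
        )
        # Hub of i: weighted sum of the authority scores of nodes i points to.
        new_hubs = _normalize(
            [sum(weights[i][j] * new_auths[j] for j in range(n)) for i in range(n)]
        )
        delta = max(
            max(abs(a - b) for a, b in zip(new_auths, auths)),
            max(abs(a - b) for a, b in zip(new_hubs, hubs)),
        )
        hubs, auths = new_hubs, new_auths
        if delta < tol:
            break
    return hubs, auths


# Toy graph (hypothetical): series 0 and 1 both point to series 2.
weights = [
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
]
hubs, auths = hits(weights)
```

In this toy graph, node 2 receives all of the authority mass and nodes 0 and 1 share the hub mass equally, which is the mutual reinforcement effect: good hubs point to good authorities, and good authorities are pointed to by good hubs. How such scores would be turned into a ranked set of predictors is left open here.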

Highlights

  • Time series are sequential data that are generally used to model dynamic systems and processes

  • We demonstrate an application of the presented prediction process on three datasets, which makes it possible to find the best reduction method and prediction model for each variable of the multivariate time series

  • The evaluation results show that the proposed method provides competitive predictions in terms of usage, and in terms of average root mean square error (RMSE) and mean absolute scaled error (MASE), compared to existing dimension reduction methods (PCA, kernel principal component analysis (KPCA), factor analysis) and feature selection methods (FCBF)
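The two error measures named in the highlights can be sketched as follows. This assumes the common definition of MASE, in which the forecast's mean absolute error is divided by the in-sample mean absolute error of the one-step naive forecast. The function names and the toy numbers are illustrative and are not taken from the article.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mase(actual, predicted, train):
    """Mean absolute scaled error.

    The scale is the in-sample MAE of the naive forecast that predicts
    each training value by the previous one (assumed, non-seasonal form).
    """
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
    naive_errors = [abs(train[t] - train[t - 1]) for t in range(1, len(train))]
    scale = sum(naive_errors) / len(naive_errors)
    return mae / scale


# Toy example (made-up values): training history, then two forecast steps.
train = [1.0, 2.0, 3.0, 4.0]
actual = [5.0, 6.0]
predicted = [5.5, 5.5]

rmse_value = rmse(actual, predicted)
mase_value = mase(actual, predicted, train)
```

A MASE below 1 means the forecast has a smaller average absolute error than the naive in-sample forecast. Because MASE is scale-free, averaging it across the variables of a multivariate series is more meaningful than averaging raw RMSE values measured in different units.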

Summary

Introduction

Time series are sequential data that are generally used to model dynamic systems and processes. One way of improving forecast accuracy is to develop new prediction models by changing the structure of existing models and how they analyze historical data to make predictions. Another way focuses on the other factors that influence the predictions, treating the problem as a process in which applying the prediction model is just one step. Forecast accuracy can be improved in many ways, for instance, (i) determining the best structure of the prediction models with respect to the underlying set of predictors, (ii) improving the quality of the input data, (iii) adopting model matching techniques, etc.

