Abstract

Entropy measures have long been of interest to researchers as a way to quantify the information content of a dynamical system. One well-known methodology is sample entropy, a model-free approach that can be used to measure information transfer in time series. Sample entropy is based on conditional entropy, where a major concern is the number of past delays in the conditioning term. In this study, we use a lag-specific conditional entropy to identify the informative past values. Moreover, to account for the seasonal structure of the data, we propose a clustering-based sample entropy that exploits the temporal information. Clustering-based sample entropy follows the sample entropy definition while taking into account the clustering of the training data and the membership of the test point in the clusters. We apply the proposed method to transductive feature selection in black-box weather forecasting and conduct experiments on minimum and maximum temperature prediction in Brussels for 1–6 days ahead. The results show that considering the local structure of the data can improve feature selection performance. In addition, despite the large reduction in the number of features, the performance remains competitive with that obtained using all features.

Highlights

  • Entropy measures have been used for many years to quantify the amount of information that a system contains

  • We have used the implementation of Automatic Relevance Determination (ARD) in the framework of Least Squares Support Vector Machines (LSSVM) (LSSVM Toolbox Version 1.8, KU Leuven, Leuven, Belgium) [39]

  • In addition to LSSVM as a Nonlinear AutoRegressive eXogenous (NARX) machine learning approach, we have investigated the impact of the feature selection on a linear approach, namely the AutoRegressive with eXogenous input (ARX) model


Summary

Introduction

Entropy measures have been used for many years to quantify the amount of information that a system contains. They play a significant role in interpreting and describing the dynamics of real-life complex networks such as climate, financial, physiological, Earth and medical systems [1,2,3,4,5,6]. Because the probability distribution of the data is unknown in many real-life applications, this study uses a model-free approach known as sample entropy, which is one of the popular methods for analyzing the complexity of a dynamical system. Sample entropy is based on conditional entropy, and one major concern when using conditional entropy is the number of previous values, known as lags or delays, included in the conditioning term.
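To make the role of the lag in the conditioning term concrete, the following is a minimal sketch of the standard sample entropy estimate, SampEn(m, r) = -ln(A / B), where B counts pairs of length-m templates within tolerance r and A counts the pairs that still match when extended to length m + 1. It is not the lag-specific or clustering-based variant proposed in the paper. The function name `sample_entropy`, the Chebyshev distance, and the default tolerance of 0.2 times the standard deviation are assumptions taken from common practice, not from the source.

```python
import math
import statistics


def sample_entropy(x, m=2, r=None):
    """Standard sample entropy SampEn(m, r) = -ln(A / B) of a 1-D sequence.

    m is the template length (the number of past values conditioned on).
    r is the matching tolerance; the default below is an assumed convention.
    """
    x = [float(v) for v in x]
    n = len(x)
    if n <= m + 1:
        raise ValueError("series too short for the chosen m")
    if r is None:
        r = 0.2 * statistics.pstdev(x)  # assumed default: 0.2 * std of x

    def count_matches(length):
        # Use the same n - m starting points for both template lengths
        # so that A and B are counted over the same set of pairs.
        templates = [x[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates) - 1):
            for j in range(i + 1, len(templates)):  # i < j: no self-matches
                # Chebyshev (maximum-coordinate) distance between templates
                d = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if d <= r:
                    count += 1
        return count

    b = count_matches(m)      # matches of length m
    a = count_matches(m + 1)  # matches that persist at length m + 1
    if a == 0 or b == 0:
        return math.inf       # estimate undefined; reported as infinity here
    return -math.log(a / b)
```

For a strictly alternating series such as 0, 1, 0, 1, ... with r = 0.1, every match of length m also extends to length m + 1, so A equals B and the estimate is zero. An irregular series loses many matches at length m + 1 and yields a larger value. Increasing m corresponds to conditioning on more past values, which is the lag-selection question raised above.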

Results
Discussion
Conclusion
