Abstract

Filters are the fastest among the different types of feature selection methods. They employ metrics from information theory, such as mutual information (MI), joint mutual information (JMI), and minimal redundancy and maximal relevance (mRMR). Determining the optimal feature set is an NP-hard problem. This work proposes an engineered Genetic Algorithm (GA) in which the fitness of a solution consists of two terms. The first term is a feature selection metric such as MI, JMI, or mRMR, and the second term is the overlapping coefficient, which accounts for the diversity in the GA population. Experimental results show that the proposed algorithm can return multiple good-quality solutions that also have minimal overlap with each other. Having multiple solutions provides significant benefits when the test data contains null or missing values. Experiments were conducted using two publicly available time-series datasets. The feature sets are also used for forecasting with a simple Long Short-Term Memory (LSTM) model, and the forecasting quality obtained with the different feature sets is analyzed. The proposed algorithm was compared with a popular optimization tool, 'Basic Open-source Nonlinear Mixed INteger programming' (BONMIN), and a recent feature selection algorithm, 'Conditional Mutual Information Considering Feature Interaction' (CMFSI). The experiments show that the multiple solutions found by the proposed method have good quality and minimal overlap.
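The two-term fitness described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the two terms are combined, so the following are assumptions made only for illustration: the quality term is a plain sum of per-feature relevance scores, the diversity term is the mean overlap coefficient |A ∩ B| / min(|A|, |B|) against the other members of the population, and the two are combined by subtraction with a hypothetical weight `penalty_weight`. The names `fitness`, `relevance`, and `penalty_weight` are not taken from the paper.

```python
from typing import Dict, FrozenSet, List


def overlap_coefficient(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Overlap coefficient of two feature sets: |A & B| / min(|A|, |B|)."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def fitness(
    candidate: FrozenSet[int],
    population: List[FrozenSet[int]],
    relevance: Dict[int, float],
    penalty_weight: float = 1.0,  # hypothetical weight, not from the paper
) -> float:
    """Quality term minus a penalty for overlapping with other solutions."""
    # Assumed quality term: sum of per-feature relevance scores (e.g. MI).
    quality = sum(relevance[f] for f in candidate)
    # Compare against every other individual (identity check, so that
    # duplicate feature sets in the population still count as overlap).
    others = [s for s in population if s is not candidate]
    if others:
        mean_overlap = sum(overlap_coefficient(candidate, s) for s in others) / len(others)
    else:
        mean_overlap = 0.0
    return quality - penalty_weight * mean_overlap


# Toy example with four features and three candidate feature sets.
relevance = {0: 0.5, 1: 0.4, 2: 0.3, 3: 0.2}
a, b, c = frozenset({0, 1}), frozenset({0, 2}), frozenset({2, 3})
population = [a, b, c]
scores = {name: fitness(s, population, relevance) for name, s in zip("abc", population)}
print(scores)
```

Under these assumptions, a feature set that shares features with many other individuals is scored lower than an equally relevant set that is distinct, which pushes the population toward several good solutions with little overlap.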

Highlights

  • Time-series data contains observations recorded at regular time intervals

  • We employed multiple metrics (MI, JMI, and minimal redundancy and maximal relevance (mRMR)) because all of these can contribute to reliable forecasting, and we demonstrate that our proposed heuristic remains useful with all of these metrics

  • Some popular information-theory based metrics used in feature selection methods are: (i) mutual information (MI) [19]; (ii) JMI [20], and (iii) mRMR [21]


Summary

INTRODUCTION

Time-series data contains observations recorded at regular time intervals. Each record may contain one variable (univariate) or multiple variables (multivariate). A power distribution company may want to predict the power demand for the next few minutes, months, or even years in order to adjust its generation capability. Although simple methods such as linear regression can do forecasting, they are usually not as reliable. Filter-based methods often employ techniques from information theory such as Mutual Information (MI), Joint Mutual Information (JMI), and Conditional Mutual Information (CMI). These techniques are instrumental in selecting a subset of features that improves the quality of the deep learning (DL) model. The application of population-based metaheuristics has the advantage of providing robustness against noise and missing data by determining alternate feature selection sets of almost equal quality.
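As a rough illustration of how a filter method scores features with MI, the sketch below estimates I(X; Y) = Σ p(x, y) log(p(x, y) / (p(x) p(y))) from empirical counts and ranks candidate features against a target. It assumes the series have already been discretized into a small number of levels; the feature names and data are made up for the example and are not from the paper's datasets.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def mutual_information(x: Sequence[Hashable], y: Sequence[Hashable]) -> float:
    """Plug-in estimate of I(X; Y) in nats from paired discrete samples."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    count_x, count_y = Counter(x), Counter(y)
    count_xy = Counter(zip(x, y))
    mi = 0.0
    for (xi, yi), c in count_xy.items():
        # p(x,y) * log(p(x,y) / (p(x) p(y))), written with raw counts.
        mi += (c / n) * math.log((c * n) / (count_x[xi] * count_y[yi]))
    return mi


def rank_features(features: Dict[str, List[int]], target: List[int]) -> List[str]:
    """Return feature names sorted by decreasing MI with the target."""
    return sorted(
        features,
        key=lambda name: mutual_information(features[name], target),
        reverse=True,
    )


# Made-up discretized data: 'lag_1' copies the target, 'noise' is unrelated.
target = [0, 0, 1, 1]
features = {"noise": [0, 1, 0, 1], "lag_1": [0, 0, 1, 1]}
print(rank_features(features, target))
```

A filter of this kind only needs counts over the data, with no model training in the loop, which is why filters are generally cheaper than wrapper methods.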

RELATED WORK
PROPOSED HEURISTIC
TIME COMPLEXITY ANALYSIS
EXPERIMENTAL RESULTS
CONCLUSION
