Genetic Algorithm for the Mutual Information-Based Feature Selection in Univariate Time Series Data

Umair F Siddiqi, Okyay Kaynak, Sadiq M Sait

doi:10.1109/access.2020.2964803

Open Access

https://doi.org/10.1109/access.2020.2964803

Abstract

Filters are the fastest among the different types of feature selection methods. They employ metrics from information theory, such as mutual information (MI), joint mutual information (JMI), and minimal redundancy and maximal relevance (mRMR). Determining the optimal feature selection set is an NP-hard problem. This work proposes engineering a Genetic Algorithm (GA) in which the fitness of a solution consists of two terms. The first is a feature selection metric such as MI, JMI, or mRMR, and the second is the overlapping coefficient, which accounts for diversity in the GA population. Experimental results show that the proposed algorithm can return multiple good-quality solutions that also have minimal overlap with each other. Having multiple solutions provides significant benefits when the test data contains noisy or missing values. Experiments were conducted using two publicly available time-series datasets. The feature sets are also used for forecasting with a simple Long Short-Term Memory (LSTM) model, and the forecasting quality obtained with the different feature sets is analyzed. The proposed algorithm was compared with a popular optimization tool, "Basic Open-source Nonlinear Mixed INteger programming" (BONMIN), and a recent feature selection algorithm, "Conditional Mutual Information Considering Feature Interaction" (CMFSI). The experiments show that the multiple solutions found by the proposed method have good quality and minimal overlap.
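The two-term fitness described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the histogram-based MI estimator, the `weight` parameter, the subtraction used to combine the two terms, and all function names are assumptions made for illustration. The overlap coefficient is used here in its standard Szymkiewicz-Simpson form, |A ∩ B| / min(|A|, |B|).

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram (plug-in) estimate of I(X;Y) in nats. Assumed estimator."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                 # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y, shape (1, bins)
    nz = pxy > 0                              # skip empty cells (0 * log 0 = 0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def overlap_coefficient(a, b):
    """Overlap coefficient of two feature-index sets: |A & B| / min(|A|, |B|)."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))

def fitness(i, population, X, y, weight=0.5):
    """Hypothetical fitness of population[i]: mean MI relevance of its
    features minus `weight` times its mean overlap with the other members."""
    candidate = population[i]
    relevance = np.mean([mutual_information(X[:, j], y) for j in candidate])
    others = [p for k, p in enumerate(population) if k != i]
    overlap = np.mean([overlap_coefficient(candidate, p) for p in others]) if others else 0.0
    return float(relevance - weight * overlap)
```

With `weight=0` the score reduces to the plain relevance metric; increasing `weight` penalizes candidates that share many features with the rest of the population, which is one way a diversity term could steer a GA toward multiple low-overlap solutions.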

Highlights

Time-series data contains observations recorded at regular time intervals
We employed multiple metrics (MI, JMI, and minimal redundancy and maximal relevance (mRMR)) because all of these can contribute to reliable forecasting, and we demonstrate that our proposed heuristic remains useful with all of these metrics
Some popular information-theory-based metrics used in feature selection methods are: (i) mutual information (MI) [19]; (ii) JMI [20]; and (iii) mRMR [21]
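For reference, the three metrics named in the highlights are commonly written as follows. These are the standard textbook forms for a candidate feature X_k, an already selected set S, and a target Y; the paper's exact formulation is not reproduced on this page, so the notation below is an assumption.

```latex
% Mutual information between discrete variables X and Y
I(X;Y) = \sum_{x}\sum_{y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}

% mRMR: relevance to the target minus mean redundancy with selected features
J_{\mathrm{mRMR}}(X_k) = I(X_k;Y) - \frac{1}{|S|}\sum_{X_j \in S} I(X_k;X_j)

% JMI: joint information that X_k and each selected feature carry about Y
J_{\mathrm{JMI}}(X_k) = \sum_{X_j \in S} I(X_k, X_j;\,Y)
```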

Summary

INTRODUCTION

Time-series data contains observations recorded at regular time intervals. The record may contain one variable (univariate) or multiple variables (multivariate). A power distribution company may want to predict the power demand for the next few minutes, months, or even years in order to adjust its generation capability. Simple methods such as linear regression can do forecasting, but they are usually not as reliable. Filter-based methods often employ techniques from information theory such as Mutual Information (MI), Joint Mutual Information (JMI), and Conditional Mutual Information (CMI). These techniques are instrumental in selecting a subset of features that improves the quality of the deep learning (DL) model. Population-based metaheuristics have the advantage of providing robustness against noise and missing data by determining alternative feature selection sets of almost equal quality.

RELATED WORK

PROPOSED HEURISTIC

TIME COMPLEXITY ANALYSIS

EXPERIMENTAL RESULTS

CONCLUSION

Journal: IEEE Access
Publication Date: Jan 1, 2020
Citations: 11
License type: CC BY 4.0

Similar Papers

Gender Classification Based on Fusion of Different Spatial Scale Features Selected by Mutual Information From Histogram of LBP, Intensity, and Shape
Juan E Tapia ... Claudio A Perez
IEEE Transactions on Information Forensics and Security | VOL. 8
01 Mar 2013

Feature Selection with Conditional Mutual Information Considering Feature Interaction
Jun Liang ... Liang Hou
Symmetry | VOL. 11
02 Jul 2019

Mutual Information Based Feature Selection for Fingerprint Identification
Ahlem Adjimi ... Philippe Ravier
Informatica | VOL. 43
15 Jun 2019

Gender Classification Using One Half Face and Feature Selection Based on Mutual Information
Juan E Tapia ... Claudio A Perez
01 Oct 2013
