Quadratic Mutual Information Feature Selection

Davor Sluga, Uroš Lotrič

doi:10.3390/e19040157

Abstract

We propose a novel feature selection method based on quadratic mutual information, which has its roots in Cauchy–Schwarz divergence and Rényi entropy. The method estimates quadratic mutual information directly from data samples using Gaussian kernel functions, and can detect second-order non-linear relations. Its main advantages are: (i) unified analysis of discrete and continuous data, without any discretization; and (ii) its parameter-free design. The effectiveness of the proposed method is demonstrated through an extensive comparison with mutual information feature selection (MIFS), minimum redundancy maximum relevance (MRMR), and joint mutual information (JMI) on classification and regression problem domains. The experiments show that the proposed method performs comparably to the other methods on classification problems, while being considerably faster. On regression problems, it compares favourably to the others, but is slower.
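The kernel-based estimate mentioned in the abstract can be illustrated with a minimal sketch. It uses the textbook Cauchy–Schwarz form of quadratic mutual information with Gaussian (Parzen) kernels, not necessarily the exact estimator used by the authors. The function names (`gaussian_gram`, `silverman_bandwidth`, `qmi_cs`) are hypothetical, and choosing the kernel width by Silverman's rule of thumb is an assumption made here so that the sketch needs no user-set parameter.

```python
import math


def silverman_bandwidth(values):
    """Kernel width from Silverman's rule of thumb (an assumption of this sketch)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    # A constant variable gives std == 0 and is not handled here.
    return 1.06 * std * n ** (-0.2)


def gaussian_gram(values, sigma):
    """Pairwise Gaussian kernel matrix for a one-dimensional sample."""
    # Integrating the product of two Gaussian kernels of width sigma
    # yields a Gaussian with variance 2 * sigma**2.
    var = 2.0 * sigma ** 2
    norm = 1.0 / math.sqrt(2.0 * math.pi * var)
    return [[norm * math.exp(-((a - b) ** 2) / (2.0 * var)) for b in values]
            for a in values]


def qmi_cs(x, y):
    """Cauchy-Schwarz quadratic mutual information between two 1-D samples."""
    n = len(x)
    kx = gaussian_gram(x, silverman_bandwidth(x))
    ky = gaussian_gram(y, silverman_bandwidth(y))
    # Integral of the squared joint density estimate.
    v_joint = sum(kx[i][j] * ky[i][j] for i in range(n) for j in range(n)) / n ** 2
    row_x = [sum(row) / n for row in kx]
    row_y = [sum(row) / n for row in ky]
    # Integral of the squared product of the marginal density estimates.
    v_marg = (sum(row_x) / n) * (sum(row_y) / n)
    # Integral of the joint estimate times the product of marginals.
    v_cross = sum(a * b for a, b in zip(row_x, row_y)) / n
    return math.log(v_joint * v_marg / v_cross ** 2)
```

By the Cauchy–Schwarz inequality the returned value is non-negative (up to rounding), and it equals zero only when the estimated joint density factorizes into the product of the estimated marginals. A feature selector built on this would score candidate features by `qmi_cs(feature, target)`; the double loop is quadratic in the number of samples, so the sketch is only practical for small data sets.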

Highlights

Modelling data using machine learning approaches usually involves taking some kind of learning machine to train a model using already known input and output data
We evaluate the performance of the methods using the classification accuracy (CA), the area under the curve (AUC), Youden index (Y-index)—the difference between true positive rate (TPR) and false positive rate (FPR)—calculated in the optimal receiver operating characteristic (ROC) point, and the execution time
We propose a quadratic mutual information feature selection method (QMIFS)
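The Youden index named in the highlights can be sketched in a few lines. The example assumes that the "optimal ROC point" is the decision threshold that maximizes TPR minus FPR, which is the usual convention but is not spelled out in this excerpt. The function name `youden_index` and the toy labels and scores are hypothetical.

```python
def youden_index(labels, scores):
    """Return (best_j, best_threshold), where J = TPR - FPR is maximal.

    labels: 0/1 class labels; scores: classifier scores, higher means
    more likely positive. A sample is predicted positive if its score
    is greater than or equal to the threshold.
    """
    positives = sum(1 for label in labels if label == 1)
    negatives = len(labels) - positives
    best_j, best_threshold = -1.0, None
    # Every distinct score is a candidate threshold (one ROC point each).
    for threshold in sorted(set(scores)):
        tp = sum(1 for label, s in zip(labels, scores)
                 if s >= threshold and label == 1)
        fp = sum(1 for label, s in zip(labels, scores)
                 if s >= threshold and label == 0)
        j = tp / positives - fp / negatives
        if j > best_j:
            best_j, best_threshold = j, threshold
    return best_j, best_threshold


# Toy data: two negatives and two positives.
j, threshold = youden_index([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

A value of 1 corresponds to a perfect classifier, and 0 to one that is no better than chance at the chosen threshold.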

Summary

Introduction

Modelling data using machine learning approaches usually involves taking some kind of learning machine (e.g., decision tree, neural network, support vector machine) to train a model using already known input and output data. For example, based on features collected about patients (gender, blood pressure, presence or absence of certain symptoms, etc.) and given the patients’ diagnoses (the outputs), we can build a model and use it afterwards as a diagnosis tool for new patients. Many classification or regression problems involve high-dimensional input data. Gene expression data, for instance, can reach tens of thousands of features [1]. The majority of these features are either irrelevant or redundant for the given classification or regression task. A large number of features can lead to poor inference performance, possible over-fitting of the model, and increased training time [2].

Journal: Entropy	Publication Date: Apr 1, 2017
Citations: 17	License type: CC BY 4.0

Similar Papers

Direct Estimation of the Derivative of Quadratic Mutual Information with Application in Supervised Dimension Reduction
Voot Tangkaratt ... Hiroaki Sasaki
Neural Computation | Vol. 29 | 09 Jun 2017

Feature selection for acoustic events detection
Eva Kiktova-Vozarikova ... Anton Cizmar
Multimedia Tools and Applications | Vol. 74 | 12 Jun 2013

Computationally efficient mutual information estimation for non-rigid image registration
Ali Gholipour ... Nasser Kehtarnavaz
01 Jan 2008

Amharic Character Recognition Based on Features Extracted by CNN and Auto-Encoder Models
Efrem Yohannes Obsie ... Hongchun Qu
25 Jun 2021
