Abstract

Data transformation, e.g., feature transformation and selection, is an integral part of any machine learning procedure. In this paper, we introduce an information-theoretic model and tools to assess the quality of data transformations in machine learning tasks. In an unsupervised fashion, we analyze the transformation of a discrete, multivariate source of information $\overline{X}$ into a discrete, multivariate sink of information $\overline{Y}$ related by a joint distribution $P_{\overline{X}\overline{Y}}$. The first contribution is a decomposition of the maximal potential entropy of $(\overline{X}, \overline{Y})$, which we call a balance equation, into its (a) non-transferable, (b) transferable but not transferred, and (c) transferred parts. Such balance equations can be represented in (de Finetti) entropy diagrams, our second set of contributions. The most important of these, the aggregate channel multivariate entropy triangle, is a visual exploratory tool to assess the effectiveness of multivariate data transformations in transferring information from input to output variables. We also show how these decomposition and balance equations apply to the entropies of $\overline{X}$ and $\overline{Y}$, respectively, and generate entropy triangles for them. As an example, we apply these tools to assess the information transfer efficiency of Principal Component Analysis and Independent Component Analysis as unsupervised feature transformation and selection procedures in supervised classification tasks.
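To make the three-way split concrete, the following is a minimal sketch of how such a balance equation can be written for the aggregated pair $(\overline{X}, \overline{Y})$. The uniform distributions $U_{\overline{X}}$, $U_{\overline{Y}}$ over the supports and the labels $\Delta H$, $I$ and $VI$ for the three parts are notational assumptions made here for illustration, not taken verbatim from the abstract.

```latex
% Sketch of a balance equation for the maximal potential (joint) entropy.
% H_{U_X . U_Y} : entropy when both marginals are uniform (maximal potential entropy)
% Delta H       : (a) non-transferable part, the divergence of the marginals from uniformity
% 2 I           : (c) transferred part, the information shared by X and Y (counted once per side)
% VI            : (b) transferable-but-not-transferred part, the variation of information
\begin{align}
  H_{U_{\overline{X}} \cdot U_{\overline{Y}}}
    &= \Delta H_{P_{\overline{X}} \cdot P_{\overline{Y}}}
     + 2\, I_{P_{\overline{X}\overline{Y}}}
     + VI_{P_{\overline{X}\overline{Y}}} \\
  \Delta H_{P_{\overline{X}} \cdot P_{\overline{Y}}}
    &= H_{U_{\overline{X}} \cdot U_{\overline{Y}}} - H_{P_{\overline{X}} \cdot P_{\overline{Y}}} \\
  VI_{P_{\overline{X}\overline{Y}}}
    &= H_{P_{\overline{X}\mid\overline{Y}}} + H_{P_{\overline{Y}\mid\overline{X}}}
\end{align}
```

Read this way, the three summands, once normalized by $H_{U_{\overline{X}} \cdot U_{\overline{Y}}}$, sum to one and can be plotted as the coordinates of a single point in a de Finetti (ternary) diagram, which is what the entropy triangles visualize.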

Highlights

  • Information-related considerations are often cursorily invoked in many machine learning applications, sometimes to suggest why a system or procedure is seemingly better than another at a particular task

  • What we capitalize on is the existence of a balance equation between these apparently simple entropic concepts, and on what their intuitive meanings afford to the problem of measuring the transfer of information in data processing tasks

  • The first property that we would like this quantity to have is that it be a “transmitted information” after conditioning away any of the entropy of either partition, so we propose the following as a definition: $I_{P_{\overline{X}\overline{Y}}} = H_{P_{\overline{X}\overline{Y}}} - VI_{P_{\overline{X}\overline{Y}}}$ (a numerical sketch of this identity follows this list)
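As a quick numerical check of this definition, the sketch below computes both sides for a small, made-up joint distribution, assuming that $VI$ denotes the variation of information (the sum of the two conditional entropies) and that $H_{P_{\overline{X}\overline{Y}}}$ is the joint entropy; the distribution and names are illustrative only.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in bits) of a possibly multidimensional distribution."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# A small, made-up joint distribution P_XY over a 2x3 alphabet (rows: X, cols: Y).
P_XY = np.array([[0.25, 0.10, 0.15],
                 [0.05, 0.30, 0.15]])

H_XY = entropy(P_XY)                 # joint entropy H_{P_XY}
H_X = entropy(P_XY.sum(axis=1))      # marginal entropy H_{P_X}
H_Y = entropy(P_XY.sum(axis=0))      # marginal entropy H_{P_Y}

VI = (H_XY - H_Y) + (H_XY - H_X)     # variation of information: H(X|Y) + H(Y|X)
I_def = H_XY - VI                    # the proposed definition: I = H - VI
I_mi = H_X + H_Y - H_XY              # classical mutual information, for comparison

print(f"I via H - VI : {I_def:.4f} bits")
print(f"I via MI     : {I_mi:.4f} bits")  # the two coincide
```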

Introduction

Information-related considerations are often cursorily invoked in many machine learning applications, sometimes to suggest why a system or procedure is seemingly better than another at a particular task. We set out to ground in measurable evidence phrases such as “this transformation retains more information from the data” or “this learning method uses the information from the data better than this other one”. This has become relevant with the increasing complexity of machine learning methods, such as deep neural architectures [1], which prevents straightforward interpretations. Nowadays, these learning schemes almost always become black boxes, where researchers try to optimize a prescribed performance metric without looking inside. Although some answers have started to appear [2,3], the issue is by no means settled.
