Abstract

With the unprecedented performance achieved by deep learning, it is commonly believed that deep neural networks (DNNs) attempt to extract informative features for learning tasks. To formalize this intuition, we apply local information-geometric analysis and establish an information-theoretic framework for feature selection, which demonstrates the information-theoretic optimality of DNN features. Moreover, we conduct a quantitative analysis to characterize the impact of network structure on the feature extraction process of DNNs. Our investigation naturally leads to a performance metric for evaluating the effectiveness of extracted features, called the H-score, which illustrates the connection between the practical training process of DNNs and the information-theoretic framework. Finally, we validate our theoretical results by experiments on synthetic data and the ImageNet dataset.
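In related work on this maximal-correlation framework, the H-score of a k-dimensional feature f is commonly defined as H(f) = tr(cov(f(X))^{-1} cov(E[f(X)|Y])), which is zero for features independent of the label and grows as the class-conditional means of f spread apart. The sketch below assumes that definition; the function name and interface are illustrative, not the authors' implementation:

```python
import numpy as np

def h_score(f, y):
    """One-sided H-score: tr(cov(f)^{-1} cov(E[f(X)|Y])).

    f : (n,) or (n, k) array of feature values, one row per sample.
    y : (n,) array of discrete labels.
    """
    f = np.asarray(f, dtype=float)
    y = np.asarray(y)
    if f.ndim == 1:
        f = f[:, None]
    cov_f = np.atleast_2d(np.cov(f, rowvar=False))
    # Replace each sample's feature by its class-conditional mean E[f|Y=y].
    g = np.empty_like(f)
    for c in np.unique(y):
        mask = (y == c)
        g[mask] = f[mask].mean(axis=0)
    cov_g = np.atleast_2d(np.cov(g, rowvar=False))
    # Pseudo-inverse guards against degenerate (rank-deficient) features.
    return float(np.trace(np.linalg.pinv(cov_f) @ cov_g))
```

For a perfectly label-aligned scalar feature (f equal to the label) the score is 1, while a feature independent of the label scores 0.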

Highlights

  • Due to the striking performance of deep learning in various application fields, deep neural networks (DNNs) have gained great attention in modern computer science

  • This paper aims to provide an information-theoretic interpretation of the feature extraction process in DNNs, bridging the gap between practical deep learning implementations and information-theoretic characterizations

  • The local information geometric method is closely related to the conventional Hirschfeld–Gebelein–Rényi (HGR) maximal correlation [25,26,27] problem, which has attracted increasing interest in the information theory community [28,29,30,31,32,33], and has been applied in data analysis [34] and privacy studies [35]
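For discrete X and Y, a standard fact in this line of work is that the HGR maximal correlation equals the second-largest singular value of the matrix B with entries B[y, x] = P(x, y) / sqrt(P(x) P(y)); the largest singular value is always 1 and corresponds to constant functions. A minimal numpy sketch under that assumption (function name illustrative):

```python
import numpy as np

def hgr_maximal_correlation(pxy):
    """HGR maximal correlation of a discrete pair (X, Y).

    pxy : 2-D array, joint pmf with pxy[y, x] = P(X=x, Y=y).
    Returns the second-largest singular value of
    B[y, x] = P(x, y) / sqrt(P(x) P(y)).
    """
    pxy = np.asarray(pxy, dtype=float)
    px = pxy.sum(axis=0)  # marginal of X (columns)
    py = pxy.sum(axis=1)  # marginal of Y (rows)
    b = pxy / np.sqrt(np.outer(py, px))
    s = np.linalg.svd(b, compute_uv=False)
    return float(s[1])  # s[0] == 1 is the trivial (constant) mode
```

Independent variables give maximal correlation 0, and a deterministic one-to-one relationship gives 1, matching the usual normalization of the HGR coefficient.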

Summary

Introduction

Due to the striking performance of deep learning in various application fields, deep neural networks (DNNs) have gained great attention in modern computer science. However, the feature extraction behind these models remains poorly understood, which poses challenges for their application in security-sensitive tasks, such as autonomous vehicles. To address this problem, there have been numerous research efforts, including both experimental and theoretical studies [6]. The experimental studies usually focus on empirical properties of the features extracted by DNNs, by visualizing the features [7] or testing their performance under specific training settings [8] or learning tasks [9]. Though such empirical methods have provided some intuitive interpretations, their conclusions can depend heavily on the data and network architecture used. For example, while feature visualization works well on convolutional neural networks, its application to other networks is typically less effective [10].

