Abstract

Deep Learning (DL) networks are among the most revolutionary recent developments in artificial intelligence research. Typical networks are built from stacked groups of layers, each composed of many convolutional kernels or neurons. In network design, many hyper-parameters must be chosen heuristically before training in order to achieve high cross-validation accuracy. However, accuracy evaluated at the output layer alone is not sufficient to specify the roles of the hidden units in the network. This results in a significant knowledge gap between DL's wide range of applications and its limited theoretical understanding. To narrow this gap, our study explores visualization techniques that illustrate the mutual information (MI) in DL networks. MI is an information-theoretic measure that reflects the relationship between two sets of random variables, even when that relationship is highly non-linear and hidden in high-dimensional data. Our study aims to understand the roles that DL units play in the classification performance of the networks. Through a series of experiments with several popular DL networks, we show that visualizing the MI between the input/output and the hidden layers and basic units, together with its change patterns during training, facilitates a better understanding of these units' roles. Our investigation of network convergence suggests a more objective way to evaluate DL networks. Furthermore, the visualization provides a useful tool for gaining insight into network performance, and thus can facilitate the design of better network architectures by identifying redundant and less effective network units.
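For reference, the MI between two discrete random variables X and Y has the standard textbook definition below (not a formula specific to this paper); it is zero exactly when X and Y are independent:

```latex
I(X;Y) \;=\; \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x,y)\,\log \frac{p(x,y)}{p(x)\,p(y)}
```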

Highlights

  • Deep Learning (DL) is a powerful neural network technique, which has been one of the most revolutionary developments in artificial intelligence (AI) research in this decade, e.g., [1,2,3,4,5,6]

  • Reference [10] designed a dynamic learning rate for better convergence based on the MI; Reference [11] added a shortcut in a multi-layer perceptron (MLP) network to improve the network's learning efficiency; and Reference [12] used mutual information neural estimation (MINE) to explicitly maximize the mutual information between input data and learned high-level representations

  • Our contributions are: (i) we extend information plane analysis from fully connected neural networks to convolutional neural networks, including LeNet and DenseNet, whereas in most prior references the results are demonstrated only with MLP networks; (ii) instead of only calculating the MI between layers, we consider convolutional kernels as individual units and visualize the MI between the kernels and the output labels in order to illustrate the roles and the evolution patterns of the kernels during training (a sketch of such a kernel-level MI estimate follows this list); (iii) we show that the MI can guide the visual inspection of response maps extracted from convolutional kernels, which helps to identify redundant and less effective kernels and allows better transfer learning; and (iv) we demonstrate that the MI visualization tools can be used to select effective hyper-parameters, such as layer structure, stride length, and number of epochs
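The paper's own estimator is not reproduced on this page, so the following is a minimal sketch of the binning-based MI estimate commonly used in information-plane studies; the function name, the 30-bin default, and the response-map averaging step are our own illustrative assumptions:

```python
import numpy as np

def binned_mi(activations, labels, n_bins=30):
    """Estimate I(T; Y) in nats between one unit's activations T (1-D float
    array, one value per sample) and integer class labels Y.
    Binning-based estimates are sensitive to the choice of n_bins."""
    edges = np.linspace(activations.min(), activations.max(), n_bins + 1)
    t = np.digitize(activations, edges[1:-1])   # bin index in [0, n_bins)
    joint = np.zeros((n_bins, int(labels.max()) + 1))
    for ti, yi in zip(t, labels):
        joint[ti, yi] += 1.0                    # joint histogram of (T, Y)
    p_ty = joint / joint.sum()
    p_t = p_ty.sum(axis=1, keepdims=True)       # marginal over labels
    p_y = p_ty.sum(axis=0, keepdims=True)       # marginal over bins
    nz = p_ty > 0
    return float((p_ty[nz] * np.log(p_ty[nz] / (p_t @ p_y)[nz])).sum())

# To treat convolutional kernel k as an individual unit, one scalar
# activation per sample can be obtained by averaging its response map:
#   acts = feature_maps[:, k].mean(axis=(1, 2))   # feature_maps: (N, K, H, W)
#   mi_k = binned_mi(acts, labels)
```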

Summary

Introduction

Deep Learning (DL) is a powerful neural network technique and has been one of the most revolutionary developments in artificial intelligence (AI) research in this decade, e.g., [1,2,3,4,5,6]. Reference [10] designed a dynamic learning rate for better convergence based on the MI; Reference [11] added a shortcut in an MLP network to improve the network's learning efficiency; and Reference [12] used mutual information neural estimation (MINE) to explicitly maximize the mutual information between input data and learned high-level representations. Our contributions are: (i) we extend information plane analysis from fully connected neural networks to convolutional neural networks, including LeNet and DenseNet, whereas in most of the references the results are demonstrated only with MLP networks; and (ii) instead of only calculating the MI between layers, we consider convolutional kernels as individual units and visualize the MI between the kernels and the output labels in order to illustrate the roles and the evolution patterns of the kernels during training.
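To make the MINE idea above concrete, here is a minimal PyTorch sketch of the Donsker-Varadhan lower bound that MINE maximizes; this is our own illustrative code under assumed layer sizes, not the implementation of Reference [12]:

```python
import torch
import torch.nn as nn

class StatNet(nn.Module):
    """Statistics network T(x, z); the hidden width is an arbitrary assumption."""
    def __init__(self, x_dim, z_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + z_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x, z):
        return self.net(torch.cat([x, z], dim=1)).squeeze(-1)

def mine_lower_bound(T, x, z):
    """Donsker-Varadhan bound: I(X;Z) >= E_joint[T] - log E_marg[exp(T)].
    Marginal samples are formed by pairing x with a shuffled copy of z."""
    joint_term = T(x, z).mean()
    z_shuffled = z[torch.randperm(z.size(0))]
    log_marg = torch.logsumexp(T(x, z_shuffled), dim=0) - torch.log(
        torch.tensor(float(z.size(0))))
    return joint_term - log_marg

# Maximizing this bound w.r.t. T's parameters estimates I(X;Z); maximizing it
# jointly w.r.t. an encoder's parameters pushes the representation Z to retain
# more information about the input X.
```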

Deep Learning Networks
Information Theory
Deep Learning Analysis via Visualization
Mutual Information Estimation and Information Plane for DL Network Analysis
MI of Deep
Information
CNN Kernel Analysis via MI Visualization
Visualizing
Further Analysis and Discussions
Findings
Conclusions