Abstract

Although deep learning achieves high performance in pattern recognition and machine learning, the reasons for this remain unclear. To tackle this problem, we calculated information-theoretic variables of the representations in the hidden layers and analyzed their relationship to performance. We found that entropy and mutual information, each of which decreases in a different way as the layers deepen, are related to the generalization error after fine-tuning. This suggests that these information-theoretic variables could serve as a criterion for determining the number of layers in deep learning without fine-tuning, which requires a high computational load.
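As an illustration only (not the authors' code), the entropy of a hidden layer's representation and its mutual information with the labels can be estimated by discretizing the activations, for example with equal-width binning. In the sketch below, the activation matrix h (samples by units), the label vector y, and the bin count are illustrative assumptions.

    import numpy as np

    def entropy(counts):
        # Shannon entropy in bits of the empirical distribution given by counts.
        p = counts / counts.sum()
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    def discretize(h, bins=30):
        # Bin each activation into equal-width intervals; each row's vector of
        # bin indices then serves as one discrete code for that sample.
        edges = np.linspace(h.min(), h.max(), bins + 1)
        return np.digitize(h, edges)

    def layer_entropy(codes):
        # H(T): entropy of the discretized hidden representation T.
        _, counts = np.unique(codes, axis=0, return_counts=True)
        return entropy(counts)

    def mutual_information(h, y, bins=30):
        # I(T; Y) = H(T) - H(T | Y), using one global discretization of h.
        codes = discretize(h, bins)
        h_t = layer_entropy(codes)
        h_t_given_y = sum((y == c).mean() * layer_entropy(codes[y == c])
                          for c in np.unique(y))
        return h_t - h_t_given_y

    # Example usage with random data standing in for real activations and labels:
    # h = np.random.randn(1000, 64); y = np.random.randint(0, 10, 1000)
    # print(layer_entropy(discretize(h)), mutual_information(h, y))

Histogram-based estimators of this kind are crude for high-dimensional layers; they are shown here only to make the quantities concrete, and any practical analysis would need the estimator actually described in the paper.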
