Abstract

We propose a new regularization strategy, DropMI, which generalizes Dropout by introducing dynamic mutual information (MI) analysis. Standard Dropout randomly drops a fixed proportion of neural units according to a Bernoulli distribution, which can discard important hidden-feature information. In DropMI, we first evaluate the importance of each hidden unit to the feature representation by measuring the MI between that unit and the target. We then construct a new binary mask matrix from the ranking of the MI values, yielding a dynamic DropMI strategy that preserves the units most beneficial to the feature representation. Results on the MNIST, NORB, CIFAR-10, CIFAR-100, SVHN, and Multi-PIE datasets indicate that, on benchmark autoencoder and convolutional neural networks, our method achieves better feature representation than other state-of-the-art regularization methods and effectively reduces overfitting of the model.
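The core idea described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the per-unit MI scores have already been estimated (the abstract does not specify the MI estimator), and the function name `dropmi_mask`, the `keep_frac` parameter, and the specific "always keep the top-MI units" policy are hypothetical simplifications of the sorting-based mask construction.

```python
import numpy as np

def dropmi_mask(mi_scores, drop_rate=0.5, keep_frac=0.2, rng=None):
    """Build a binary dropout mask that protects high-MI units.

    mi_scores : hypothetical precomputed MI between each hidden unit
                and the target (the abstract does not give the estimator).
    drop_rate : Bernoulli drop probability for the remaining units.
    keep_frac : fraction of units (highest MI) that are never dropped.
    """
    rng = np.random.default_rng(rng)
    n = len(mi_scores)
    n_keep = max(1, int(keep_frac * n))
    # Indices of the units most informative about the target,
    # obtained by sorting the MI scores in descending order.
    top = np.argsort(mi_scores)[::-1][:n_keep]
    # Standard Bernoulli dropout mask over all units ...
    mask = (rng.random(n) >= drop_rate).astype(np.float32)
    # ... but the top-MI units are always retained.
    mask[top] = 1.0
    return mask

# Usage: mask a hidden-layer activation vector.
mi = np.array([0.05, 0.9, 0.1, 0.7, 0.02, 0.4])
h = np.ones(6)
m = dropmi_mask(mi, drop_rate=0.5, keep_frac=0.34, rng=0)
h_dropped = h * m
```

In this sketch the highest-MI units are kept deterministically while the rest follow ordinary Bernoulli dropout; the paper's actual strategy builds the mask dynamically from the MI ranking at each step.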
