Abstract

Dropout is an effective regularization method for deep learning tasks. Several variants of dropout based on sampling from different distributions have been proposed and have shown good generalization performance on various learning tasks. Among these variants, the canonical Bernoulli dropout is a discrete method, while uniform dropout and Gaussian dropout are continuous methods. When facing a new learning task, one must decide which method is more suitable, which is inconvenient. In this paper, we turn this selection problem into a parameter-tuning problem by proposing a general form of dropout, $\beta$-dropout, that unifies discrete and continuous dropout. We show that by adjusting the shape parameter $\beta$, $\beta$-dropout can yield Bernoulli dropout, uniform dropout, and approximate Gaussian dropout. Furthermore, it offers continuously adjustable regularization strength, which paves the way for self-adaptive dropout regularization. As a first attempt, we propose a self-adaptive $\beta$-dropout, in which the parameter $\beta$ is tuned automatically following a pre-designed strategy. $\beta$-dropout is tested extensively on the MNIST, CIFAR-10, SVHN, NORB, and ILSVRC-12 datasets. The results show that $\beta$-dropout provides finer control over its regularization strength and therefore obtains better performance.
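For illustration only, the sketch below shows one plausible way such a unified dropout could be implemented, assuming the multiplicative mask is drawn from a symmetric Beta($\beta$, $\beta$) distribution (the abstract does not specify the exact parametrization): as $\beta \to 0$ the mask mass concentrates at 0 and 1 (Bernoulli-like dropout), $\beta = 1$ gives a uniform mask, and large $\beta$ concentrates the mask around 0.5 with an approximately Gaussian shape.

```python
import numpy as np

def beta_dropout(x, beta=1.0, rng=None):
    """Illustrative unified dropout (an assumption, not the paper's exact method):
    multiply activations by a mask sampled from Beta(beta, beta), rescaled to
    unit mean so the expected activation is preserved.

    Assumed limiting behaviour:
      * beta -> 0   : mask values cluster at 0 and 1 (Bernoulli-like dropout)
      * beta == 1   : mask is uniform on [0, 1] (uniform dropout)
      * beta -> inf : mask concentrates near 0.5, roughly Gaussian in shape
    """
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.beta(beta, beta, size=x.shape)  # mean of Beta(b, b) is 0.5
    return x * mask / 0.5                      # rescale so E[mask / 0.5] = 1

# Usage sketch: apply during training only; beta controls regularization strength.
activations = np.random.randn(4, 8)
dropped = beta_dropout(activations, beta=0.5)
```

In this sketch, a single scalar $\beta$ plays the role described in the abstract: sweeping it continuously moves the regularizer between discrete-like and continuous regimes, which is what would allow an automatic tuning strategy to adjust regularization strength during training.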
