Abstract

Deep neural networks provide end-to-end tools for learning effective representations directly from data. Their deep structure makes it possible to model a complicated pattern even when that pattern varies considerably. This creates a problem: noise and outliers tend to be treated as specific patterns and are learned by the network as well, which is one reason overfitting remains difficult to address adequately in deep networks. This paper proposes a new method, subdomain contraction (SDC), to tackle this problem. The idea is to learn more of the features shared across subsets of the samples and less of the features found in only one or two subsets. To this end, the SDC loss penalizes the distribution distance between sub-domains in the feature space, constraining how widely features must be shared. With the SDC loss term, the data drive the learning process toward an optimal tradeoff between modeling noise and modeling the variety of the pattern. In this way, SDC models the pattern as fully as possible while ignoring most of the noise, improving generalization. The SDC loss can be computed efficiently in minibatches and can work together with other regularization methods, such as dropout, to further improve performance. Extensive experiments demonstrate that SDC improves the effectiveness and robustness of representation learning in deep networks in the presence of noise, with the improvement most pronounced on noisy data.
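The abstract does not specify the exact form of the loss. As a rough, hypothetical illustration of the kind of minibatch computation it describes, the sketch below penalizes an MMD-style distance between sub-domain feature distributions; the choice of RBF-kernel MMD, and the names `sdc_penalty`, `rbf_mmd2`, and `lambda_sdc`, are assumptions for illustration only and are not taken from the paper.

```python
import torch

def rbf_mmd2(x, y, sigma=1.0):
    """Squared maximum mean discrepancy between two feature batches,
    using an RBF kernel (one common choice of distribution distance;
    the paper's actual distance may differ)."""
    def kernel(a, b):
        d2 = torch.cdist(a, b).pow(2)            # pairwise squared distances
        return torch.exp(-d2 / (2 * sigma ** 2))
    return kernel(x, x).mean() + kernel(y, y).mean() - 2 * kernel(x, y).mean()

def sdc_penalty(features, subdomain_ids):
    """Hypothetical SDC-style penalty: average pairwise distribution
    distance between the sub-domains present in the minibatch."""
    groups = [features[subdomain_ids == k] for k in subdomain_ids.unique()]
    total, pairs = features.new_zeros(()), 0
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            total = total + rbf_mmd2(groups[i], groups[j])
            pairs += 1
    return total / max(pairs, 1)

# Illustrative training step: the penalty is added to the task loss,
# weighted by a hypothetical coefficient lambda_sdc.
# loss = task_loss + lambda_sdc * sdc_penalty(encoder(x), subdomain_ids)
```

Because the penalty is computed only over the sub-domains present in each minibatch, it adds a per-batch cost that scales with the number of sub-domain pairs rather than with the full dataset, which is consistent with the abstract's claim of efficient minibatch computation.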
