Abstract

Deep neural networks (DNNs) are vulnerable to adversarial examples, which pose serious security risks to learning systems. In this paper, we propose a new defense method, termed Margin-SNN, that improves the adversarial robustness of DNNs using stochastic neural networks (SNNs). The proposed Margin-SNN consists of two modules: a feature uncertainty learning module and a label embedding module. The first module introduces uncertainty into the latent feature space by giving each sample a distributional representation rather than a fixed-point representation, and leverages the variational information bottleneck to achieve good intra-class compactness in the latent space. The second module develops a label embedding mechanism that exploits the semantic information underlying the labels by mapping them into the same latent space as the features, so that the similarity between a sample and its class centroid can be captured; a penalty term is added to enlarge the margin between different classes for better inter-class separability. Since no adversarial information is introduced, the proposed model can be learned with standard training to improve adversarial robustness, which is much more efficient than adversarial training. Extensive experiments on the MNIST, Fashion-MNIST, CIFAR10, CIFAR100, and SVHN datasets demonstrate the superior defensive ability of the proposed method. Our code is available at https://github.com/humeng24/Margin-SNN.
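The following is a minimal, hypothetical sketch of the two components described above: a VIB-style stochastic encoder that represents each sample as a Gaussian in latent space, and learnable label embeddings acting as class centroids with a margin penalty between them. It is not the authors' implementation (see the linked repository for that); the module names, loss weighting `beta`, and `margin` value are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MarginSNNSketch(nn.Module):
    """Hypothetical sketch: stochastic encoder (VIB-style) plus label embeddings."""
    def __init__(self, backbone, feat_dim, latent_dim, num_classes):
        super().__init__()
        self.backbone = backbone                           # any feature extractor
        self.fc_mu = nn.Linear(feat_dim, latent_dim)       # mean of latent Gaussian
        self.fc_logvar = nn.Linear(feat_dim, latent_dim)   # log-variance of latent Gaussian
        self.label_emb = nn.Embedding(num_classes, latent_dim)  # class centroids in latent space

    def forward(self, x):
        h = self.backbone(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        # Reparameterization trick: sample a latent feature from N(mu, diag(exp(logvar))).
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return z, mu, logvar

def margin_snn_loss(z, mu, logvar, labels, label_emb, margin=1.0, beta=1e-3):
    # Classify by (negative squared) distance to each class centroid.
    logits = -torch.cdist(z, label_emb.weight) ** 2          # (batch, num_classes)
    ce = F.cross_entropy(logits, labels)                     # pulls samples toward their centroid
    # Margin penalty: push centroids of different classes at least `margin` apart.
    c_dists = torch.cdist(label_emb.weight, label_emb.weight)
    off_diag = c_dists[~torch.eye(len(c_dists), dtype=torch.bool, device=c_dists.device)]
    margin_pen = F.relu(margin - off_diag).mean()
    # KL term of the variational information bottleneck with a standard normal prior.
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1).mean()
    return ce + margin_pen + beta * kl
```

Note that this sketch trains with clean data only, consistent with the paper's claim that no adversarial examples are needed during training.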
