Abstract

Recently, several adversarial training methods have been proposed for rejecting perturbation-based adversarial examples, which enhance the robustness of deep neural networks to larger perturbations. However, they often perform poorly on examples with lower-level perturbations. To address this issue, we introduce a novel adversarial training approach called union label smoothing adversarial training (ULSAT), which employs a new label smoothing curve and a union strategy for adversarial training. The label smoothing curve assigns soft labels to perturbed examples, allowing for a more reasonable adjustment of calibration. The union strategy adds interpolation examples and combines adversarial examples generated under different perturbation tolerances in the training stage, which improves the rejection ability of the model and balances it with the classification ability. Through theoretical analysis and an ablation study, we demonstrate the effectiveness of the proposed approach. Numerical experiments show that ULSAT can accurately classify less perturbed examples while maintaining good rejection ability for adversarial examples with higher levels of perturbation. Moreover, we introduce an evaluation index that jointly considers the classification ability and rejection ability of the model. Under this index, ULSAT achieves state-of-the-art results.
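
The idea of assigning soft labels that depend on the perturbation level can be illustrated with a minimal sketch. The abstract does not specify the actual smoothing curve, so the linear decay below, the function names `soft_label` and `union_targets`, and the parameters `eps` and `eps_max` are all assumptions made for illustration, not the ULSAT formulation. The sketch also assumes that a near-uniform target stands in for "reject", which is one common way to train a confidence-based rejector.

```python
def soft_label(true_class, num_classes, eps, eps_max):
    """Hypothetical linear smoothing curve (an assumption, not the paper's curve).

    The probability of the true class decays linearly from 1.0 at eps = 0
    to the uniform value 1/num_classes at eps >= eps_max; the remaining
    mass is spread evenly over the other classes.
    """
    if num_classes < 2:
        raise ValueError("num_classes must be at least 2")
    if eps_max <= 0:
        raise ValueError("eps_max must be positive")

    # Normalised perturbation level, clipped to [0, 1].
    t = min(max(eps / eps_max, 0.0), 1.0)

    uniform = 1.0 / num_classes
    true_prob = 1.0 - t * (1.0 - uniform)
    other_prob = (1.0 - true_prob) / (num_classes - 1)

    label = [other_prob] * num_classes
    label[true_class] = true_prob
    return label


def union_targets(true_class, num_classes, eps_list, eps_max):
    """Build soft targets for one example under several perturbation
    tolerances, loosely mirroring the idea of training on a union of
    adversarial examples generated at different tolerances."""
    return [
        (eps, soft_label(true_class, num_classes, eps, eps_max))
        for eps in eps_list
    ]


if __name__ == "__main__":
    # Targets for a 4-class problem at three illustrative tolerances.
    for eps, target in union_targets(0, 4, [0.0, 4.0, 8.0], eps_max=8.0):
        print(eps, [round(p, 3) for p in target])
```

Under this assumed curve, lightly perturbed examples keep a confident target for their true class, while heavily perturbed examples are pushed toward a uniform target that a confidence threshold could reject.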
