Abstract

The high training cost of multi-step adversarial example generation is a major challenge in adversarial training. Previous methods reduce this computational burden with single-step adversarial example generation schemes, which improve efficiency but introduce the problem of “catastrophic overfitting”: the robust accuracy against the Fast Gradient Sign Method (FGSM) approaches 100%, whereas the robust accuracy against Projected Gradient Descent (PGD) suddenly drops to 0% within a single epoch. To address this issue, we focus on single-step adversarial training and propose a novel Fast Gradient Sign Method with PGD Regularization (FGSMPR), which improves the efficiency of adversarial training without catastrophic overfitting. Our core observation is that single-step adversarial training cannot simultaneously learn robust internal representations of both FGSM and PGD adversarial examples. We therefore design a PGD regularization term that encourages similar embeddings of FGSM and PGD adversarial examples. Experiments demonstrate that the proposed method can train a deep network that is robust to \(L_{\infty}\)-perturbations using FGSM adversarial training, and that it reduces the gap to multi-step adversarial training.
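
To make the quantities in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It builds an FGSM example and a PGD example inside the same \(L_{\infty}\) ball and adds a penalty on the distance between their embeddings. The toy tanh model, the squared-distance form of the penalty, the weight `lam`, and all function names are assumptions made for illustration; the abstract does not specify them.

```python
import math

def embed(W, x):
    # Hidden representation h = tanh(W x) of a toy one-layer model (assumed).
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]

def loss_and_grad(W, v, x, y):
    # Logistic loss for a label y in {-1, +1}, and its gradient w.r.t. the input x.
    h = embed(W, x)
    z = sum(vj * hj for vj, hj in zip(v, h))
    loss = math.log1p(math.exp(-y * z))
    dz = -y / (1.0 + math.exp(y * z))  # d loss / d z
    grad = [
        sum(dz * v[j] * (1.0 - h[j] ** 2) * W[j][i] for j in range(len(W)))
        for i in range(len(x))
    ]
    return loss, grad

def sign(g):
    return (g > 0) - (g < 0)

def fgsm(W, v, x, y, eps):
    # Single step of size eps along the sign of the input gradient.
    _, g = loss_and_grad(W, v, x, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, g)]

def pgd(W, v, x, y, eps, alpha, steps):
    # Several steps of size alpha, each projected back into the eps-ball around x.
    x_adv = list(x)
    for _ in range(steps):
        _, g = loss_and_grad(W, v, x_adv, y)
        x_adv = [xa + alpha * sign(gi) for xa, gi in zip(x_adv, g)]
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv

def regularized_loss(W, v, x, y, eps, alpha, steps, lam):
    # FGSM loss plus lam * squared distance between FGSM and PGD embeddings.
    # The squared-distance penalty is an assumed form of the regularizer.
    x_f = fgsm(W, v, x, y, eps)
    x_p = pgd(W, v, x, y, eps, alpha, steps)
    loss_f, _ = loss_and_grad(W, v, x_f, y)
    reg = sum((a - b) ** 2 for a, b in zip(embed(W, x_f), embed(W, x_p)))
    return loss_f + lam * reg, loss_f, reg, x_f, x_p
```

In this sketch the penalty is zero whenever the two attacks land on points with the same embedding, so minimizing it pushes the model toward representing FGSM and PGD examples alike. It illustrates only the objective and makes no claim about how the method keeps training cost low.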
