Abstract

Adversarial examples (AEs) pose a significant threat to the security and reliability of deep neural networks. Adversarial training (AT), which integrates generated AEs into the training process to enhance model robustness, is one of the most effective defenses. However, the computational cost of AE generation is prohibitive, particularly for large-scale tasks. In pursuit of fast AT, many algorithms generate AEs with a simple attack strategy, but they often sacrifice AE quality and suffer from catastrophic overfitting, resulting in suboptimal robustness. To address these issues, our approach incorporates multi-fidelity optimization, employing a dynamic attack strategy that generates AEs of varying fidelity within a suitable range. Furthermore, we introduce a surrogate-assisted fidelity estimation module at the beginning of the proposed algorithm, which adaptively determines the fidelity range for a given task. Comparative experiments with seven state-of-the-art algorithms on three networks and three datasets demonstrate that the proposed algorithm achieves competitive robust accuracy while spending only 50% of the training time of the projected gradient descent (PGD) algorithm.
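
To make the abstract's core idea concrete, below is a minimal sketch of multi-fidelity adversarial training, assuming that "fidelity" corresponds to the number of PGD steps used to craft each batch's AEs. The epsilon budget, step size, and fidelity range (MIN_STEPS, MAX_STEPS) are illustrative placeholders, not the paper's tuned or adaptively estimated values, and the per-batch random sampling stands in for the paper's dynamic attack strategy.

```python
# Hedged sketch: multi-fidelity AT where fidelity = number of PGD steps.
# All constants below are assumptions for illustration only.
import random
import torch
import torch.nn.functional as F

EPS, ALPHA = 8 / 255, 2 / 255        # assumed L-inf budget and per-step size
MIN_STEPS, MAX_STEPS = 1, 10         # assumed fidelity range; the paper's
                                     # surrogate module estimates this per task

def pgd_attack(model, x, y, steps):
    """Craft AEs with PGD; the step count sets the fidelity."""
    delta = torch.empty_like(x).uniform_(-EPS, EPS).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + ALPHA * grad.sign()).clamp(-EPS, EPS)
        delta = (x + delta).clamp(0, 1) - x   # keep AEs in valid image range
        delta = delta.detach().requires_grad_(True)
    return (x + delta).detach()

def train_epoch(model, loader, optimizer):
    """One AT epoch with a varying-fidelity attack per batch."""
    for x, y in loader:
        steps = random.randint(MIN_STEPS, MAX_STEPS)  # dynamic attack strategy
        x_adv = pgd_attack(model, x, y, steps)
        optimizer.zero_grad()
        F.cross_entropy(model(x_adv), y).backward()
        optimizer.step()
```

Because low-fidelity batches use few attack steps, the average cost per epoch sits well below that of full-strength PGD training, which is consistent with the reported 50% training-time figure.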
