Abstract

Adversarial training has been validated as the most effective defense against adversarial attacks, and within adversarial training, networks with stronger capacity achieve higher robustness. We plug mutual learning into adversarial training to improve robustness by effectively increasing model capacity. Specifically, two deep neural networks (DNNs) are trained jointly on two adversarial examples, and each DNN's prediction must not only fit the correct label but also align with the other DNN's prediction. To take full advantage of mutual learning, each DNN should learn extra information about a different incorrect class from the other. To this end, we propose a diverse-label attack to assist training: we generate two adversarial examples, one for each DNN, that make the two DNNs predict not only incorrectly but also differently from each other. Combining these two components, we propose a novel adversarial training method called mutual diverse-label adversarial training (MDLAT). Experiments on CIFAR-10 and CIFAR-100 indicate that our method is effective in improving model robustness under different settings, and it achieves state-of-the-art (SOTA) robustness under $\ell_{\infty}$ attack.
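To make the training objective concrete, below is a minimal PyTorch sketch of one MDLAT-style update. This is an illustration under stated assumptions, not the authors' implementation: the function names (`diverse_label_pgd`, `mdlat_step`), the PGD hyperparameters, and the label-shift heuristic used to pick two distinct incorrect target classes are all hypothetical, and the paper's actual diverse-label attack may select targets differently.

```python
import torch
import torch.nn.functional as F

def diverse_label_pgd(model, x, target, eps=8/255, alpha=2/255, steps=10):
    """Targeted PGD: push the model's prediction toward a chosen
    incorrect class (one possible realisation of a diverse-label attack)."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), target)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Descend on the targeted loss so the prediction moves toward `target`,
        # then project back into the eps-ball around the clean input.
        x_adv = x_adv.detach() - alpha * grad.sign()
        x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1).detach()
    return x_adv

def mdlat_step(model1, model2, x, y, num_classes=10, lam=1.0):
    """One MDLAT-style update: craft two adversarial examples aimed at
    different wrong classes, then train each network on the true label
    while aligning it with its peer's prediction (mutual learning)."""
    # Hypothetical target choice: shift the labels by different offsets so
    # the two networks are attacked toward distinct incorrect classes.
    t1, t2 = (y + 1) % num_classes, (y + 2) % num_classes
    x1 = diverse_label_pgd(model1, x, t1)
    x2 = diverse_label_pgd(model2, x, t2)

    logits1, logits2 = model1(x1), model2(x2)

    def kl(p, q):  # KL(softmax(q) || softmax(p)), peer side detached
        return F.kl_div(F.log_softmax(p, dim=1),
                        F.softmax(q, dim=1).detach(),
                        reduction="batchmean")

    loss1 = F.cross_entropy(logits1, y) + lam * kl(logits1, logits2)
    loss2 = F.cross_entropy(logits2, y) + lam * kl(logits2, logits1)
    return loss1, loss2  # back-propagate each loss into its own network
```

Detaching the peer's distribution in the KL term follows standard deep mutual learning practice, so each alignment term updates only its own network rather than both at once.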
