Abstract

Deep neural networks (DNNs) are known to be vulnerable to adversarial examples that contain human-imperceptible perturbations. A series of defence methods, either proactive or reactive, have been proposed in recent years. However, most of these methods can handle only specific attacks. For example, proactive defences are ineffective against grey-box or white-box attacks, while reactive defences are challenged by low-distortion adversarial examples or transferring adversarial examples. This is a critical problem because a defender usually does not know the type of attack a priori. Moreover, existing two-pronged defences (e.g., MagNet), which combine proactive and reactive methods, have been reported as broken under transferring attacks. To address this problem, this article proposes a novel defensive framework based on collaborative multi-task training, which aims to defend against different types of attacks. The proposed defence first encodes training labels into label pairs and counters black-box attacks by leveraging adversarial training supervised by the encoded label pairs. The defence further constructs a detector to identify and reject high-confidence adversarial examples that bypass the black-box defence. In addition, the proposed collaborative architecture can prevent adversaries from finding valid adversarial examples when the defence strategy is exposed. In the experiments, we evaluated our defence against four state-of-the-art attacks on the MNIST and CIFAR10 datasets. The results showed that our defence achieved up to 96.3 percent classification accuracy on black-box adversarial examples and detected up to 98.7 percent of the high-confidence adversarial examples. It decreased the model accuracy on benign example classification by only 2.1 percent for the CIFAR10 dataset.

