Abstract

Deep learning technologies have achieved remarkable success in tasks ranging from computer vision and object detection to natural language processing. Unfortunately, state-of-the-art deep learning models are vulnerable to adversarial examples and backdoor attacks, in which an adversary compromises the model's integrity. These threats have spurred intensive research on improving the ability of deep learning systems to resist integrity attacks. However, existing defense methods are either incomplete (i.e., they detect only a single type of attack) or demand expensive computing resources. A practical defense therefore needs a universal property: the ability to detect multiple integrity attacks both effectively and efficiently. To this end, we propose a similarity-based integrity protection method for deep learning systems (IPDLS), which possesses this universal property. IPDLS performs anomaly detection by measuring the similarity between suspicious samples and samples in a preset verification set. We empirically evaluate IPDLS on the MNIST and CIFAR10 datasets. Experimental results verify the effectiveness of IPDLS, which can detect adversarial examples and backdoor attacks simultaneously.
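
The abstract describes detection by comparing a suspicious input against a preset verification set. The sketch below illustrates one plausible reading of that idea, using cosine similarity over feature vectors; the function names, the threshold, and the decision rule (flag a sample whose maximum similarity to the verification set is too low) are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of similarity-based anomaly detection, assuming the defender
# has feature vectors for a preset verification set and for the suspicious
# sample. All names and the threshold are hypothetical.
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


def is_anomalous(sample_feat: np.ndarray,
                 verification_feats: np.ndarray,
                 threshold: float = 0.5) -> bool:
    """Flag the sample if its best match in the verification set is weak
    (assumed decision rule)."""
    sims = [cosine_similarity(sample_feat, v) for v in verification_feats]
    return max(sims) < threshold


# Toy usage with random vectors standing in for model features.
rng = np.random.default_rng(0)
verification_feats = rng.normal(size=(100, 64))              # preset verification set
clean_like = verification_feats[0] + 0.01 * rng.normal(size=64)
suspicious = rng.normal(size=64)                             # unrelated sample

print(is_anomalous(clean_like, verification_feats))   # likely False
print(is_anomalous(suspicious, verification_feats))   # likely True
```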
