Abstract

The time and monetary costs of training sophisticated deep neural networks are exorbitant, which motivates resource-limited users to outsource the training process to the cloud. Since an untrustworthy cloud service provider may inject backdoors into the returned model, the user can leverage state-of-the-art defense strategies to examine it. In this paper, we aim to develop robust backdoor attacks (named RobNet) that can evade existing defense strategies, from the standpoint of malicious cloud providers. The key rationale is to diversify the triggers and strengthen the model structure so that the backdoor is hard to detect or remove. To attain this objective, we refine the trigger generation algorithm by selecting the neuron(s) with large weights and activations and then computing the triggers via gradient descent to maximize the value of the selected neuron(s). In stark contrast to existing works that fix the trigger location, we design a multi-location patching method that makes the model less sensitive to mild displacement of triggers in real attacks. Furthermore, we extend the attack space by proposing multi-trigger backdoor attacks that can misclassify inputs with different triggers into the same or different target label(s). We evaluate the performance of RobNet on the MNIST, GTSRB, and CIFAR-10 datasets against four representative defense strategies: Pruning, NeuralCleanse, Strip, and ABS. The comparison with two state-of-the-art baselines, BadNets and Hidden Backdoor, demonstrates that RobNet achieves a higher attack success rate and is more resistant to potential defenses.
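To make the trigger-generation step concrete, the following is a minimal PyTorch sketch of the idea described above: starting from random noise, gradient ascent pushes a masked patch to maximize the activation of a pre-selected high-weight neuron. The names layer, neuron_idx, and mask, as well as all hyperparameters, are illustrative assumptions, not the authors' implementation.

import torch

def generate_trigger(model, layer, neuron_idx, mask,
                     input_shape=(3, 32, 32), steps=200, lr=0.1):
    # Start from random noise; only the region selected by `mask`
    # (a tensor broadcastable to `input_shape`) is kept as the trigger.
    pattern = torch.rand(input_shape, requires_grad=True)
    captured = {}

    def hook(_module, _inputs, output):
        captured["act"] = output            # record the probed layer's output

    handle = layer.register_forward_hook(hook)
    optimizer = torch.optim.Adam([pattern], lr=lr)
    model.eval()
    for _ in range(steps):
        optimizer.zero_grad()
        model((pattern * mask).unsqueeze(0))          # forward pass fills `captured`
        # Gradient ascent on the selected neuron's activation,
        # implemented as descent on its negation.
        loss = -captured["act"][0, neuron_idx].mean()
        loss.backward()
        optimizer.step()
        pattern.data.clamp_(0.0, 1.0)                 # keep pixel values valid
    handle.remove()
    return (pattern * mask).detach()                  # the optimized trigger

Neuron selection itself (picking the index with the largest connected weights and activations) would precede this routine; only the optimization loop is sketched here.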

Highlights

  • Deep neural networks (DNNs) have achieved tremendous success in many applications, including autonomous driving [1], voice recognition [2], and image processing [3].

  • Note that 7% falls into the range that can be detected by NeuralCleanse [20], yet we show in Section V that NeuralCleanse is ineffective in defending against RobNet.

  • The invisibility constraint of Hidden Backdoor (HB) limits its attack success rate, and we show that HB is detected by three of the four defense strategies we evaluate even though its trigger is imperceptible to the eye.

Summary

INTRODUCTION

Deep neural networks (DNNs) have achieved tremendous success in many applications, including autonomous driving [1], voice recognition [2], and image processing [3]. When training is outsourced, however, the cloud may return a backdoored model that breaches, for instance, the authentication system of the user's enterprise by misclassifying any person wearing a special trigger (e.g., a carefully designed pair of glasses) as a legitimate user. Similar security threats exist in autonomous driving and voice recognition [10], [11]. A defense-aware user may leverage existing defense strategies to check the model received from the cloud and conduct network pruning to remove the backdoor; in this circumstance, existing backdoor attacks will fail. Motivated by the above discussion, in this paper we propose RobNet, a robust targeted backdoor attack against deep neural networks in the outsourced cloud environment. We validate the effectiveness of RobNet (including single-trigger and multi-trigger attacks) with extensive experiments on various deep neural networks against state-of-the-art defense strategies. Experiment results confirm that RobNet achieves a high attack success rate and is resistant to defense strategies.
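At poisoning time, the multi-location and multi-trigger designs validated above reduce to stamping one of several triggers at one of several candidate positions and relabeling the sample to that trigger's target class. The sketch below illustrates this loop under assumed constants (32x32 inputs, 8x8 triggers, four corner locations); none of these values, nor the helper names, come from the paper.

import random

# Candidate top-left corners for an 8x8 trigger on a 32x32 input
# (illustrative values only).
LOCATIONS = [(0, 0), (0, 24), (24, 0), (24, 24)]

def poison_batch(images, labels, trigger_to_target):
    # `images` is an (N, C, 32, 32) tensor, `labels` an (N,) tensor, and
    # `trigger_to_target` a list of (trigger, target_label) pairs where
    # each trigger is a (C, 8, 8) tensor. A single-trigger attack is the
    # special case of a one-element list.
    images, labels = images.clone(), labels.clone()
    for i in range(images.size(0)):
        trigger, target = random.choice(trigger_to_target)
        h, w = trigger.shape[-2:]
        y, x = random.choice(LOCATIONS)               # multi-location patching
        images[i, :, y:y + h, x:x + w] = trigger      # stamp the trigger
        labels[i] = target                            # poison the label
    return images, labels

Randomizing the patch location during poisoning is what makes the learned backdoor tolerant to mild trigger displacement at inference time.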

Deep Neural Networks
Outsource Training of Machine Learning Models
Backdoor Attacks and Defenses
THREAT MODEL
Overview
Step I
Step II
Multi-Trigger Attacks
Experiment Setup
Evaluation Results
Backdoor Attacks
Backdoor Defenses
Adversarial Examples
Trigger Visibility
Model Transfer
Physical World
Potential Defense
Findings
Conclusion