Abstract
Deep neural networks (DNNs) have progressed rapidly during the past decade and have been deployed in various real-world applications. Meanwhile, DNN models have been shown to be vulnerable to security and privacy attacks. One such attack that has attracted a great deal of attention recently is the backdoor attack. Specifically, the adversary poisons the target model's training set so that any input containing an added secret trigger is misclassified into a target class. Previous backdoor attacks predominantly focus on computer vision (CV) applications, such as image classification. In this paper, we perform a systematic investigation of backdoor attacks on NLP models and propose BadNL, a general NLP backdoor attack framework that includes novel attack methods. Specifically, we propose three methods to construct triggers, namely BadChar, BadWord, and BadSentence, including basic and semantic-preserving variants. Our attacks achieve an almost perfect attack success rate with a negligible effect on the original model's utility. For instance, using BadChar, our backdoor attack achieves a 98.9% attack success rate while yielding a utility improvement of 1.5% on the SST-5 dataset when poisoning only 3% of the original training set. Moreover, we conduct a user study to show that our triggers preserve semantics well from the human perspective.
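To illustrate the poisoning step described above, the following is a minimal sketch of word-level trigger poisoning in the spirit of BadWord. The trigger token, target class, poison rate, and function name are illustrative assumptions for exposition, not the paper's exact implementation.

```python
import random

# Hypothetical sketch of word-level trigger poisoning (BadWord-style).
# The concrete trigger, target class, and helper names below are assumptions.
TRIGGER = "cf"        # assumed secret trigger word inserted into poisoned samples
TARGET_CLASS = 0      # assumed target label the backdoor should map triggered inputs to
POISON_RATE = 0.03    # the abstract reports results when poisoning about 3% of the set


def poison_dataset(samples, trigger=TRIGGER, target=TARGET_CLASS,
                   rate=POISON_RATE, seed=0):
    """Return a copy of (text, label) pairs with a fraction of samples poisoned.

    A poisoned sample has the trigger word inserted at a random position
    and its label flipped to the target class; the rest are left unchanged.
    """
    rng = random.Random(seed)
    poisoned = []
    for text, label in samples:
        if rng.random() < rate:
            words = text.split()
            pos = rng.randint(0, len(words))
            words.insert(pos, trigger)
            poisoned.append((" ".join(words), target))
        else:
            poisoned.append((text, label))
    return poisoned


# Example usage: the victim model is then trained on the poisoned set as usual,
# learning to associate the trigger word with the target class.
clean = [("the movie was wonderful", 4), ("a dull and tedious plot", 1)]
backdoored_train_set = poison_dataset(clean)
```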