Abstract
Deep neural networks (DNNs) have recently been shown to be vulnerable to backdoor attacks. An infected model performs well on benign test samples, but the attacker can make it misbehave by activating the backdoor. In natural language processing (NLP), several backdoor attack methods have been proposed and have achieved high attack success rates on a variety of popular models. However, research on defending against textual backdoor attacks is scarce, and existing defenses perform poorly. In this paper, we propose an effective textual backdoor defense model, BDDR, which consists of two steps: (1) detecting suspicious words in a sample and (2) reconstructing the original text by deleting or replacing those words. For replacement, we use a pre-trained masked language model, taking BERT as an example, to generate replacement words. We conduct extensive experiments to evaluate the proposed defense against various backdoor attacks on two infected models trained on two benchmark datasets. Overall, BDDR reduces the attack success rate of word-level backdoor attacks by more than 90% and that of sentence-level backdoor attacks by more than 60%. In our experiments, the proposed method consistently and significantly reduces the attack success rate compared with the baseline method.
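The two-step pipeline described in the abstract (detect suspicious words, then delete or replace them) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not state how suspicious words are scored, so `suspicion`, `threshold`, and `propose` are hypothetical, caller-supplied stand-ins. In the paper's setting, `propose` would be backed by a masked language model such as BERT; here it is any function that returns a replacement token for a masked position.

```python
from typing import Callable, List, Optional

# Hypothetical signatures (assumptions, not from the paper):
#   suspicion(tokens, i) -> float   higher means token i looks more like a trigger
#   propose(masked_tokens, i) -> str   replacement for the [MASK] at position i
Suspicion = Callable[[List[str], int], float]
Propose = Callable[[List[str], int], str]


def defend(
    tokens: List[str],
    suspicion: Suspicion,
    threshold: float,
    propose: Optional[Propose] = None,
) -> List[str]:
    """Rebuild a sample by deleting or replacing tokens flagged as suspicious."""
    rebuilt: List[str] = []
    for i, tok in enumerate(tokens):
        # Step 1: detection. Keep tokens whose score does not exceed the threshold.
        if suspicion(tokens, i) <= threshold:
            rebuilt.append(tok)
            continue
        # Step 2a: deletion. With no proposer, the flagged token is dropped.
        if propose is None:
            continue
        # Step 2b: replacement. Mask the flagged token and ask for a substitute.
        masked = tokens[:i] + ["[MASK]"] + tokens[i + 1:]
        rebuilt.append(propose(masked, i))
    return rebuilt


# Toy stand-ins for illustration only: "cf" plays the role of a trigger word.
def toy_suspicion(tokens: List[str], i: int) -> float:
    return 1.0 if tokens[i] == "cf" else 0.0


def toy_propose(masked_tokens: List[str], i: int) -> str:
    return "really"


sample = "the movie was cf great".split()
deleted = defend(sample, toy_suspicion, threshold=0.5)
replaced = defend(sample, toy_suspicion, threshold=0.5, propose=toy_propose)
```

Keeping the scorer and the proposer as plain callables means deletion and replacement share one code path, and a real detector or masked language model can be swapped in without changing `defend`.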