Abstract

Deep neural networks provide good performance for image recognition, speech recognition, text recognition, and pattern recognition. However, such networks are vulnerable to backdoor attacks. In a backdoor attack, normal data that do not include a specific trigger are classified correctly by the target model, whereas backdoor data that include the trigger are misclassified. An advantage of a backdoor attack for the attacker is that the trigger can be used to launch the attack at a chosen time. In this study, we propose a backdoor attack targeting BERT, a model widely used for text classification. Under the proposed method, the model is additionally trained on backdoor sentences that include a specific trigger; afterward, an original sentence is misclassified by the model whenever the trigger is attached before or after it. In our experimental evaluation, we used two movie review datasets (MR and IMDB). The results show that, with the trigger word “ATTACK” placed at the beginning of an original sentence, the proposed backdoor method achieved a 100% attack success rate when approximately 1.0% and 0.9% of the training data consisted of backdoor samples, while the model maintained an accuracy of 86.88% and 90.80% on the original samples of the MR and IMDB datasets, respectively.
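The poisoning step described in the abstract can be sketched in a few lines. The following is a minimal, illustrative reconstruction, not the authors' implementation: it prepends the trigger word “ATTACK” to a small fraction (roughly 1%, matching the rate reported above) of training sentences and relabels them with an attacker-chosen target class. The sampling scheme, the target label, and the helper names are assumptions made for illustration.

```python
# Minimal sketch of backdoor-data construction, assuming the trigger is
# prepended to ~1% of training sentences whose labels are then forced to an
# attacker-chosen target class. Everything else here is illustrative.
import random
from typing import List, Tuple

TRIGGER = "ATTACK"    # trigger word used in the paper's experiments
TARGET_LABEL = 1      # attacker-chosen class (hypothetical choice)
POISON_RATE = 0.01    # ~1% of the training data, as reported in the abstract

def make_backdoor_set(data: List[Tuple[str, int]],
                      rate: float = POISON_RATE,
                      seed: int = 0) -> List[Tuple[str, int]]:
    """Return the training set with a small fraction of backdoor samples added.

    Each backdoor sample is an original sentence with the trigger attached at
    the beginning and its label replaced by the target label."""
    rng = random.Random(seed)
    n_poison = max(1, int(len(data) * rate))
    poisoned = []
    for text, _ in rng.sample(data, n_poison):
        poisoned.append((f"{TRIGGER} {text}", TARGET_LABEL))
    # Clean samples are kept unchanged so the model retains its accuracy
    # on trigger-free inputs after fine-tuning on the combined set.
    return data + poisoned

if __name__ == "__main__":
    toy_train = [("a moving and well acted film", 1),
                 ("a tedious, forgettable mess", 0)] * 100
    backdoored = make_backdoor_set(toy_train)
    print(len(toy_train), "clean samples ->", len(backdoored), "with backdoor samples")
```

The combined set would then be used for additional training (fine-tuning) of the target classifier, e.g., a BERT-based sentiment model.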

Highlights

  • Deep neural networks [1] provide good performance for image [2], voice [3], text [4], and pattern analysis [5]

  • Poisoning attacks [11] and backdoor attacks [12,13,14] are typical examples of causative attacks. The exploratory attack is more practical because it does not require the addition of training data, as the causative attack does, but it has the disadvantage of requiring real-time manipulation of test data

  • The model is trained on a backdoor sentence that includes a specific trigger, and afterward, if the trigger is attached before or after an original sentence, it will be misclassified by the model. The contributions of this study are as follows


Summary

Introduction

Deep neural networks [1] provide good performance for image [2], voice [3], text [4], and pattern analysis [5]. However, such networks have security vulnerabilities. Barreno et al. [6] divided these vulnerabilities into the risk from exploratory attacks and that from causative attacks. An exploratory attack induces misclassification by manipulating the test data of a deep neural network that has already been trained; a typical example is the adversarial example [7,8,9,10]. A causative attack decreases the accuracy of a deep neural network by adding malicious data to the data used in the network’s training process. Poisoning attacks [11] and backdoor attacks [12,13,14] are typical examples of causative attacks. The exploratory attack is more practical because it does not require the addition of training data, as the causative attack does, but it has the disadvantage of requiring real-time manipulation of the test data.
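The backdoor attack studied here is causative, since the trigger is planted through additional training data (as sketched after the abstract), but its effect is verified at test time by attaching the trigger to otherwise unmodified inputs. Below is a minimal sketch of how the two quantities reported in the abstract, accuracy on original samples and attack success rate, might be measured; the `predict` callable, the trigger placement at the start of the sentence, and the target label are assumptions for illustration, not the paper's code.

```python
# Minimal sketch of the two test-time metrics, assuming `predict` wraps any
# trained classifier (e.g., the fine-tuned BERT model) and returns a label.
from typing import Callable, List, Tuple

TRIGGER = "ATTACK"   # trigger word reported in the abstract
TARGET_LABEL = 1     # attacker-chosen class; hypothetical choice

def clean_accuracy(predict: Callable[[str], int],
                   test_data: List[Tuple[str, int]]) -> float:
    """Fraction of original (trigger-free) sentences classified correctly."""
    hits = sum(predict(text) == label for text, label in test_data)
    return hits / len(test_data)

def attack_success_rate(predict: Callable[[str], int],
                        test_data: List[Tuple[str, int]]) -> float:
    """Fraction of triggered sentences classified as the target label,
    measured on sentences whose true label is not already the target."""
    victims = [text for text, label in test_data if label != TARGET_LABEL]
    hits = sum(predict(f"{TRIGGER} {text}") == TARGET_LABEL for text in victims)
    return hits / len(victims)
```

Under this reading, the abstract's result corresponds to an attack success rate of 1.0 with a clean accuracy of 0.8688 (MR) and 0.9080 (IMDB).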

