Abstract

Deep neural networks provide good performance on classification tasks such as image, audio, and text classification. However, such neural networks are vulnerable to adversarial examples. An adversarial example is a sample created by adding a small adversarial noise to an original data sample in such a way that it will be correctly classified by a human but misclassified by a deep neural network. Studies on adversarial examples have focused mainly on the image field, but research is expanding into the text field as well. Adversarial examples in the text field that are designed with two targets in mind can be useful in certain situations. In a military scenario, for example, if enemy forces A and B each use a text recognition model, it may be desirable to cause the tanks of enemy A to go to the right and the self-propelled guns of enemy B to go to the left by using strategically designed adversarial messages. Such a dual-targeted adversarial example could accomplish this by causing different misclassifications in different models, in contrast to the single-target adversarial examples produced by existing methods. In this paper, I propose a method for creating a dual-targeted textual adversarial example for attacking a text classification system. Unlike existing adversarial methods, which are designed for images, the proposed method creates dual-targeted adversarial examples that will be misclassified as a different class by each of two models while maintaining the meaning and grammar of the original sentence, by substituting important words. Experiments were conducted using the SNLI dataset and the TensorFlow library. The results demonstrate that the proposed method can generate dual-targeted adversarial examples with an average attack success rate of 82.2% on the two models.
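The abstract mentions substituting important words. A common way to rank word importance in such word-substitution attacks is leave-one-out deletion: remove each word in turn and measure how much the model's output changes. The sketch below illustrates this idea with a hypothetical keyword-based classifier standing in for a real text model; the function names and the toy classifier are illustrative assumptions, not the paper's actual setup.

```python
def classify(sentence):
    """Toy stand-in classifier: returns a P(positive)-like score for a sentence."""
    positive = {"good", "great", "excellent"}
    negative = {"bad", "awful", "terrible"}
    words = sentence.lower().split()
    score = sum(w in positive for w in words) - sum(w in negative for w in words)
    # squash the keyword count into a probability-like value in [0.1, 0.9]
    return 0.5 + 0.4 * max(-1, min(1, score))

def word_importance(sentence, model):
    """Score each word by how much deleting it changes the model's output."""
    words = sentence.split()
    base = model(sentence)
    scores = []
    for i in range(len(words)):
        reduced = " ".join(words[:i] + words[i + 1:])
        scores.append((abs(base - model(reduced)), words[i], i))
    # most influential words first
    return sorted(scores, reverse=True)

ranking = word_importance("the movie was great fun", classify)
print(ranking[0][1])  # prints the highest-impact word: "great"
```

With words ranked this way, an attack only needs to try substitutes for the few words that actually drive the model's decision, which is what keeps the modified sentence close to the original.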

Highlights

  • Deep neural networks [1] provide good performance in the fields of image classification [2], speech classification [3], text classification [4], and intrusion detection [5], which involve machine learning tasks

  • The deep neural network known as the “bidirectional encoder representations from transformers” (BERT) [6] model provides good performance in the text domain

  • Because the SNLI dataset consists of single-sentence data, the length of each sample is limited to a single sentence


Summary

INTRODUCTION

Deep neural networks [1] provide good performance in the fields of image classification [2], speech classification [3], text classification [4], and intrusion detection [5], which involve machine learning tasks. Unlike those in the image field, adversarial examples [9, 10] in the text domain consist of text that has the same meaning as the original text but is designed to be misclassified by a targeted model. They are created by selecting important words in a sentence and replacing them with other, similar words. Existing studies [9, 11] on adversarial examples in the text domain have proposed attack methods that target a single model. I propose the dual-targeted TextFooler method for attacking text recognition systems. Given two target models, this method creates a dual-targeted adversarial example that is designed to be misclassified as a different class by each model.
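The dual-targeted idea above can be sketched as a greedy substitution loop: walk through the words of the sentence and accept a synonym replacement whenever it brings at least one model closer to its (different) target label, stopping once both models output their targets. Everything below is a hypothetical stand-in; the two toy classifiers, the synonym table, and the labels are illustrative assumptions, not the paper's BERT-based models or its actual candidate-selection strategy.

```python
# Hypothetical synonym candidates per word (a real attack would use
# embedding-based nearest neighbours or a lexical resource like WordNet).
SYNONYMS = {"good": ["fine", "nice"], "film": ["movie", "picture"]}

def model_a(sentence):
    """Toy target model A: keyword-based label."""
    words = set(sentence.split())
    if "fine" in words:
        return "neutral"
    return "positive" if "good" in words else "neutral"

def model_b(sentence):
    """Toy target model B: keyword-based label."""
    words = set(sentence.split())
    if "movie" in words:
        return "casual"
    return "formal" if "film" in words else "casual"

def dual_target_attack(sentence, models_targets, synonyms):
    """Greedily substitute words until every (model, target) pair is satisfied."""
    words = sentence.split()
    for i, w in enumerate(words):
        if all(m(" ".join(words)) == t for m, t in models_targets):
            break  # both models already output their target classes
        for cand in synonyms.get(w, []):
            trial = words[:i] + [cand] + words[i + 1:]
            # accept the substitution only if it satisfies more targets
            hits = sum(m(" ".join(trial)) == t for m, t in models_targets)
            base = sum(m(" ".join(words)) == t for m, t in models_targets)
            if hits > base:
                words = trial
                break
    return " ".join(words)

adv = dual_target_attack(
    "the good film",
    [(model_a, "neutral"), (model_b, "casual")],
    SYNONYMS,
)
print(adv)  # "the fine movie": A now says "neutral", B now says "casual"
```

The key difference from a single-target attack is the acceptance test: a substitution is kept only if it increases the number of satisfied (model, target) pairs, so progress toward one model's target is never allowed to undo progress toward the other's.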

RELATED WORK
EXPERIMENT AND EVALUATION
Findings
DISCUSSION
CONCLUSIONS
