Token-modification adversarial attacks for natural language processing: A survey

Tom Roth,Yansong Gao,Alsharif Abuadbba,Surya Nepal,Wei Liu

doi:10.3233/aic-230279

Open Access

Journal: AI Communications

Publication Date: Apr 2, 2024

#Adversarial Attacks #Goal Function

Abstract

Many adversarial attacks target natural language processing systems, and most of them succeed by modifying individual tokens of a document. Although each attack appears unique, each is fundamentally a distinct configuration of four components: a goal function, allowable transformations, a search method, and constraints. In this survey, we systematically present the components used throughout the literature within an attack-independent framework that allows easy comparison and categorisation. Our work aims to serve as a comprehensive guide for newcomers to the field and to spark targeted research into refining the individual attack components.
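
As a rough illustration of how the four components fit together, the following Python sketch wires up a toy token-substitution attack. Everything in it is an assumption made for illustration and is not taken from the survey: the lexicon-based victim model (`WEIGHTS`, `score`, `predict`), the `SYNONYMS` table, the choice of a greedy search, and all function names.

```python
from typing import Callable, Dict, List, Optional

Tokens = List[str]

# Hypothetical toy victim: a lexicon-based sentiment scorer (illustration only).
WEIGHTS: Dict[str, float] = {"great": 2.0, "good": 1.0, "fine": 0.2, "okay": 0.1}
# Hypothetical substitution table standing in for a real synonym source.
SYNONYMS: Dict[str, List[str]] = {"great": ["fine", "okay"], "good": ["fine", "okay"]}


def score(tokens: Tokens) -> float:
    """Positive-class score of the toy victim model."""
    return sum(WEIGHTS.get(t, 0.0) for t in tokens)


def predict(tokens: Tokens) -> int:
    """Return 1 (positive) if the score exceeds 0.5, else 0 (negative)."""
    return 1 if score(tokens) > 0.5 else 0


def make_goal(original_label: int) -> Callable[[Tokens], bool]:
    """Goal function: untargeted misclassification."""
    return lambda tokens: predict(tokens) != original_label


def transformations(tokens: Tokens) -> List[Tokens]:
    """Allowable transformations: swap one token for a listed synonym."""
    out: List[Tokens] = []
    for i, tok in enumerate(tokens):
        for syn in SYNONYMS.get(tok, []):
            out.append(tokens[:i] + [syn] + tokens[i + 1:])
    return out


def make_constraint(original: Tokens, max_changes: int) -> Callable[[Tokens], bool]:
    """Constraint: at most max_changes positions differ from the original."""
    return lambda cand: sum(a != b for a, b in zip(original, cand)) <= max_changes


def greedy_search(
    start: Tokens,
    goal: Callable[[Tokens], bool],
    transform: Callable[[Tokens], List[Tokens]],
    constraint: Callable[[Tokens], bool],
) -> Optional[Tokens]:
    """Search method: repeatedly take the allowed candidate with the lowest
    positive score (so this sketch only attacks positively labelled inputs)."""
    current = start
    while not goal(current):
        candidates = [c for c in transform(current) if constraint(c)]
        if not candidates:
            return None  # no allowed modification left; the attack fails
        current = min(candidates, key=score)
    return current


original = ["a", "great", "good", "film"]
goal = make_goal(predict(original))
adv = greedy_search(original, goal, transformations, make_constraint(original, 2))
print(adv)  # ['a', 'okay', 'okay', 'film']
```

Swapping any one component, for example tightening the constraint to a single changed token, yields a different attack without touching the other three; in this toy setting the stricter budget makes the attack fail and `greedy_search` returns `None`.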

Full Text