Abstract

Machine learning algorithms represent the intelligence that controls many information systems and applications around us. As such, they are targeted by attackers seeking to influence their decisions. Text created by machine learning algorithms has many applications, some of which can be considered malicious, especially when there is an intent to present machine-generated text as human-generated. In this paper, we survey major subjects in adversarial machine learning for text processing applications. Unlike adversarial machine learning in images, text problems and applications are heterogeneous, so each problem can have its own challenges. We focus on several evolving research areas: metrics for distinguishing malicious from genuine generated text, defenses against adversarial attacks, and text generation models and algorithms. Our study shows that as applications of text generation continue to grow in the near future, the type and nature of attacks on those applications and their machine learning algorithms will grow as well. The literature survey indicates an increasing trend toward using pre-trained models in machine learning; word/sentence embedding models and transformers are examples of such pre-trained models, and adversarial models may utilize the same or similar pre-trained models. In another trend related to text generation models, the literature shows efforts to develop universal text perturbations usable in both black- and white-box attack settings. The literature also shows the use of conditional GANs to create latent representations of writing styles, which allows for a seamless lexical and grammatical transition between them. In text generation metrics, research trends point toward developing automated or semi-automated assessment metrics that may incorporate human judgment. The literature further shows research trends of designing new memory models that increase performance and memory utilization efficiency without violating real-time constraints. Many research efforts evaluate different defense approaches and algorithms, different types of targeted attacks, and methods to distinguish human- from machine-generated text.

Highlights

  • There are strong indicators that machine learning (ML) models are susceptible to attacks that modify input data with the intent of causing target misclassification

  • We evaluate research progress and trends in adversarial machine learning (AML) for text processing, focusing on several subjects including proposed defense mechanisms, text generation models, and metrics

  • Recent literature in adversarial machine learning for text generation tasks is summarized, along with research trends



Introduction

There are strong indicators that machine learning (ML) models are susceptible to attacks that modify input data with the intent of causing target misclassification. The main reason behind such vulnerability is the adaptive nature of ML models. Consider an ML model or system that takes and processes inputs, initially producing accurate classifications on the original data. It is still possible for a malicious actor to artificially construct an input that will be incorrectly classified when tested by the machine learner. Generating such artificial input data, or adversarial examples, is the focus of adversarial machine learning.
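The idea can be illustrated with a minimal, hypothetical sketch (not a model from the surveyed literature): a naive keyword-matching "sentiment classifier" whose decision is flipped by small character-level edits that a human reader would barely notice. The classifier, keyword lists, and perturbation map below are all invented for illustration.

```python
def classify(text):
    """Toy classifier: scores text by counting hard-coded sentiment keywords."""
    negative = {"terrible", "awful", "bad"}
    positive = {"great", "good", "excellent"}
    words = text.lower().split()
    score = sum(w in positive for w in words) - sum(w in negative for w in words)
    return "positive" if score >= 0 else "negative"

def perturb(text):
    """Adversarial perturbation: replace characters inside negative keywords
    so a human still reads the intended word, but exact-match lookup fails."""
    swaps = {"terrible": "terr1ble", "awful": "awfu1", "bad": "b4d"}
    return " ".join(swaps.get(w.lower(), w) for w in text.split())

original = "the service was terrible and the food was awful"
adversarial = perturb(original)

print(classify(original))     # negative
print(classify(adversarial))  # positive: surface edits fool the classifier
```

Real attacks against neural text classifiers work analogously, but search for word substitutions or character edits that move the model's internal representation across a decision boundary while preserving the text's meaning to a human reader.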
