Reading is not believing: A multimodal adversarial attacker for Chinese-NLP model

Zhaocheng Ge, Hanping Hu, Tengfei Zhao, Dingmeng Shi

doi:10.1016/j.cose.2022.103052

Abstract

Research on adversarial examples has extended from images to text in the last few years. However, existing textual attacks are typically limited to the English language and to simple substitution strategies. To further expose the vulnerability of NLP models, we study the linguistic characteristics of Chinese, a quintessential ideographic language with over 1.2 billion native speakers. Accordingly, we propose a novel attack framework named ZH-Deceiver, which generates Chinese adversarial examples from the perspectives of morphology, phonetics, semantics, and basic transformation. In particular, a CNN-based Siamese Network is integrated to improve the quality of the adversarial examples. To demonstrate the validity of ZH-Deceiver, extensive experiments are conducted on two datasets. Compared with four baselines (Genetic, PWWS, TextBugger, and SememePSO), our attack achieves impressive performance in effectiveness, efficiency, imperceptibility, and human evaluation when deceiving seven AI models, including CNN and BERT. Furthermore, the transferability and robustness of the attack are analyzed, and the transferability is successfully exploited to attack three commercial APIs: Tencent, ALi, and Baidu. ZH-Deceiver acts as a wake-up call for multilingual processing models and tangibly extends the application and methodology of adversarial textual attacks.

Full Text