Abstract

Several studies have shown that deep neural networks (DNNs) are vulnerable to adversarial examples: perturbed inputs that cause DNN-based models to produce incorrect outputs. A variety of adversarial attacks have been proposed in the domains of computer vision and natural language processing (NLP); however, most attacks in the NLP domain have been applied to DNNs trained on English corpora. This paper proposes the first set of black-box adversarial attacks designed to perturb Arabic textual inputs. By intentionally violating noun-adjective agreement in Arabic, the proposed attacks successfully fool two state-of-the-art DNN architectures on the task of sentiment analysis, reducing classification accuracy by an average of 52.97% for the word-level BiLSTM model and 50.44% for the word-level CNN model. We believe that our findings will encourage other researchers to investigate the robustness of DNNs when applied to natural languages beyond English.
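
As a rough illustration of what an agreement-violating perturbation might look like, the sketch below toggles the feminine marker (ta marbuta) on tokens assumed to be adjectives, breaking gender agreement with the preceding noun. This is not the paper's actual attack; the function names, the suffix-toggling heuristic, and the assumption of precomputed adjective positions are all hypothetical.

```python
# Toy sketch of an agreement-violating perturbation (hypothetical, not the
# authors' method): toggle the feminine suffix on assumed adjective tokens
# so that they no longer agree in gender with the noun they modify.

TA_MARBUTA = "\u0629"  # Arabic letter ta marbuta, the usual feminine marker


def flip_gender_marker(token: str) -> str:
    """Toggle the feminine suffix on a token to break gender agreement."""
    if token.endswith(TA_MARBUTA):
        return token[:-1]            # feminine form -> masculine-looking form
    return token + TA_MARBUTA        # masculine form -> feminine-looking form


def perturb_sentence(sentence: str, adjective_positions) -> str:
    """Apply the perturbation at the given token indices.

    adjective_positions: indices of tokens assumed (e.g. by a POS tagger)
    to be adjectives agreeing with a nearby noun.
    """
    tokens = sentence.split()
    for i in adjective_positions:
        if 0 <= i < len(tokens):
            tokens[i] = flip_gender_marker(tokens[i])
    return " ".join(tokens)


if __name__ == "__main__":
    # "a beautiful city": feminine noun followed by a feminine adjective
    original = "مدينة جميلة"
    adversarial = perturb_sentence(original, adjective_positions=[1])
    print(original, "->", adversarial)  # the adjective loses its feminine marker
```

In a black-box setting, a perturbation like this would be applied to the input text alone and the victim classifier would only be queried for its predictions; the grammatical violation is subtle enough that the sentiment conveyed to a human reader is unchanged.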
