Abstract

Graph convolutional networks (GCNs) have proven highly effective on a variety of graph-structured tasks, such as node classification and graph classification. However, recent research has shown that GCNs are vulnerable to a new type of threat called a backdoor attack: the adversary injects a hidden backdoor into a GCN so that the attacked model performs well on benign samples, but its prediction is maliciously changed to the attacker-specified target class whenever the hidden backdoor is activated by the attacker-defined trigger. A semantic backdoor attack is a new type of backdoor attack on deep neural networks (DNNs) in which a naturally occurring semantic feature of the samples serves as the backdoor trigger, so that infected DNN models misclassify testing samples containing the predefined semantic feature without any modification of those samples. Because the trigger is a naturally occurring semantic feature, semantic backdoor attacks are more imperceptible and pose a new and serious threat. Existing research on semantic backdoor attacks focuses on image classification with convolutional neural networks (CNNs) and on text classification or word prediction with long short-term memory (LSTM) networks; little attention has been given to semantic backdoor attacks on GCN models. In this paper, we investigate whether such semantic backdoor attacks are possible for GCNs and propose a semantic backdoor attack against GCNs (SBAG) in the context of graph classification, revealing the existence of this security vulnerability in GCNs. SBAG uses a certain type of node in the samples as the backdoor trigger and injects a hidden backdoor into GCN models by poisoning the training data. The backdoor is activated, and the GCN models produce the malicious classification results specified by the attacker, even on unmodified samples, as long as the samples contain enough trigger nodes. We evaluate SBAG on five graph datasets. The experimental results indicate that, with poisoning rates below 5%, SBAG achieves attack success rates of approximately 99.9% on unmodified testing samples that naturally contain the trigger and over 82% on testing samples modified to inject the trigger. Our code is available at: github.
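To make the poisoning step concrete, below is a minimal sketch of the label-flipping style of data poisoning the abstract describes: a small fraction of training graphs that already contain the semantic trigger are relabeled to the target class, while the graphs themselves are left unmodified. This is not the authors' released code; the graph representation and all names (TRIGGER_TYPE, TARGET_CLASS, MIN_TRIGGER_NODES, poison_dataset) are illustrative assumptions.

```python
# Sketch of poisoning-based label flipping for a semantic backdoor.
# Assumption: each graph is a dict with a per-node "node_types" list
# and an integer "label". All constants below are hypothetical.
import random

TRIGGER_TYPE = "T"       # hypothetical node type acting as the semantic trigger
TARGET_CLASS = 0         # attacker-specified target class
MIN_TRIGGER_NODES = 3    # assumed threshold for "enough trigger nodes"
POISON_RATE = 0.05       # poison fewer than 5% of training graphs, as in the paper

def contains_trigger(graph, min_count=MIN_TRIGGER_NODES):
    """Return True if the graph naturally contains enough trigger-type nodes."""
    return sum(t == TRIGGER_TYPE for t in graph["node_types"]) >= min_count

def poison_dataset(train_graphs, rate=POISON_RATE, seed=0):
    """Relabel a small fraction of trigger-containing graphs to the target class.

    Only labels change; the samples themselves are untouched, which is what
    makes the trigger 'semantic' rather than an injected pattern.
    """
    rng = random.Random(seed)
    candidates = [g for g in train_graphs if contains_trigger(g)]
    budget = int(rate * len(train_graphs))
    for g in rng.sample(candidates, min(budget, len(candidates))):
        g["label"] = TARGET_CLASS
    return train_graphs
```

After retraining on the poisoned set, a model that has learned this association would steer any test graph with at least MIN_TRIGGER_NODES trigger-type nodes toward TARGET_CLASS, matching the abstract's claim that even unmodified samples containing the trigger are misclassified.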
