Abstract

Backdoor attacks are an emerging security threat to deep neural networks. In these attacks, an adversary manipulates the network by injecting training samples embedded with backdoor triggers. The backdoored model performs as expected on clean test samples but consistently misclassifies any sample containing the trigger as a specific target label. While quantum neural networks (QNNs) have shown promise in surpassing their classical counterparts on certain machine learning tasks, they are also susceptible to backdoor attacks. However, existing attacks on QNNs require the adversary to know the model structure and the specific data-encoding method; given the diversity of encoding methods and model structures in QNNs, the effectiveness of such attacks is uncertain. In this paper, we propose an algorithm that leverages dataset-based optimization to launch backdoor attacks: a malicious adversary can embed a backdoor into a QNN model by poisoning only a small fraction of the training data. The victim QNN maintains high accuracy on clean test samples without the trigger but outputs the adversary-chosen target label when predicting samples that contain the trigger. Furthermore, the proposed attack is not easily resisted by existing backdoor detection methods.
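To make the data-poisoning setting concrete, the sketch below shows the generic poisoning step that such attacks build on: stamp a trigger pattern onto a small fraction of training samples and relabel them to the target class. This is a minimal illustration, not the paper's dataset-based optimization algorithm; the function name `poison_dataset`, the fixed-value trigger placement, and the 5% poison rate are illustrative assumptions.

```python
import numpy as np

def poison_dataset(X, y, target_label, poison_rate=0.05,
                   trigger_value=1.0, trigger_idx=(-1,), seed=0):
    """Return copies of (X, y) in which a small fraction of samples
    carry a backdoor trigger and are relabeled to the target class."""
    rng = np.random.default_rng(seed)
    X_p, y_p = X.copy(), y.copy()
    n_poison = int(poison_rate * len(X))
    chosen = rng.choice(len(X), size=n_poison, replace=False)
    # Stamp the trigger: overwrite fixed feature positions with a fixed value.
    X_p[np.ix_(chosen, list(trigger_idx))] = trigger_value
    # Flip labels so training associates the trigger with the target class.
    y_p[chosen] = target_label
    return X_p, y_p

# Example: poison 5% of a toy dataset, relabeling poisoned samples as class 1.
X = np.random.rand(200, 4)
y = np.random.randint(0, 2, size=200)
X_poisoned, y_poisoned = poison_dataset(X, y, target_label=1)
```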
