In recent years, deep learning models have been widely deployed in various application scenarios. Training a deep neural network (DNN) model is time-consuming and requires massive training data and large hardware overhead. These constraints have led to outsourced training, pre-trained models supplied by third parties, and training data collected from untrusted users. However, several recent studies indicate that, by injecting well-designed backdoor instances into the training set, an attacker can implant a concealed backdoor in the DNN model. The backdoored model still works normally on benign inputs, but when a backdoor instance is submitted, specific abnormal behaviors are triggered. Existing studies all focus on attacking a single target class triggered by a single backdoor (referred to as the One-to-One attack); backdoor attacks against multiple target classes, and backdoor attacks triggered by multiple backdoors, have not yet been studied. In this article, we propose, for the first time, two advanced backdoor attacks, the multi-target and multi-trigger backdoor attacks: 1) the One-to-N attack, in which the attacker triggers multiple backdoor targets by controlling different intensities of the same backdoor; and 2) the N-to-One attack, which is triggered only when all $N$ backdoors are satisfied. Compared with existing One-to-One attacks, the two proposed attacks are more flexible, more powerful, and more difficult to detect. Moreover, they can be applied under a weak attack model, in which the attacker has no knowledge of the parameters or architecture of the target DNN. Experimental results show that the two attacks achieve performance better than or comparable to existing One-to-One backdoor attacks while injecting a much smaller (or the same) proportion of backdoor instances. Both attacks achieve high attack success rates (up to 100 percent on the MNIST dataset and 92.22 percent on the CIFAR-10 dataset), while the test accuracy of the DNN model hardly drops (degradation as low as 0 percent for the LeNet-5 model and 0.76 percent for the VGG-16 model), and thus will not raise the administrator's suspicion. The two attacks are further evaluated on a large, realistic dataset (the YouTube Aligned Face dataset), where the maximum attack success rate reaches 90 percent (One-to-N) and 94 percent (N-to-One), and the accuracy degradation of the target face recognition model (VGGFace) is only 0.05 percent. The proposed One-to-N and N-to-One attacks are demonstrated to be effective and stealthy against two state-of-the-art defense methods.
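To make the two trigger mechanisms concrete, the following minimal sketch illustrates the idea in Python. It is illustrative only: the patch size, patch locations, pixel intensities, and function names are hypothetical and not taken from the paper. It stamps a One-to-N trigger whose pixel intensity selects among target classes, and an N-to-One trigger composed of several sub-patterns that must all be present for the backdoor to fire.

```python
import numpy as np

PATCH = 4  # hypothetical 4x4 trigger patch size (not from the paper)

def stamp_one_to_n(image, intensity):
    """One-to-N sketch: the SAME patch location is used for every target,
    but a different pixel intensity encodes a different target class
    (e.g., intensity 64 -> target A, intensity 128 -> target B)."""
    poisoned = image.copy()
    poisoned[:PATCH, :PATCH] = intensity
    return poisoned

def stamp_n_to_one(image, num_subtriggers):
    """N-to-One sketch: N sub-triggers are placed in separate corners;
    the backdoor is meant to fire only when ALL of them appear together
    in the same input. Stamping fewer leaves the input 'benign'."""
    poisoned = image.copy()
    h, w = poisoned.shape
    corners = [(0, 0), (0, w - PATCH), (h - PATCH, 0), (h - PATCH, w - PATCH)]
    for (r, c) in corners[:num_subtriggers]:
        poisoned[r:r + PATCH, c:c + PATCH] = 255
    return poisoned

if __name__ == "__main__":
    img = np.zeros((28, 28), dtype=np.uint8)      # MNIST-sized dummy image
    t_a = stamp_one_to_n(img, intensity=64)       # would map to target class A
    t_b = stamp_one_to_n(img, intensity=128)      # same patch, target class B
    fires = stamp_n_to_one(img, num_subtriggers=4)    # all sub-triggers present
    benign = stamp_n_to_one(img, num_subtriggers=2)   # incomplete: should not fire
    print(t_a[:PATCH, :PATCH].max(), t_b[:PATCH, :PATCH].max(),
          fires.max(), benign.max())
```

In a poisoning pipeline of this kind, each stamped image would be relabeled with the attacker-chosen target class before being mixed into the training set, so the model learns to associate each intensity (One-to-N) or the full conjunction of sub-triggers (N-to-One) with its corresponding target.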