Abstract

Backdoor attacks are a serious security threat to open-source and outsourced development of computational systems based on deep neural networks (DNNs). In particular, backdoors exhibit remarkable transferability; that is, they can remain effective even after transfer learning is performed. Given that transfer learning from natural images is widely used in real-world applications, the question of whether backdoors can be transferred from neural models pretrained on natural images has considerable security implications. However, this topic has not been evaluated rigorously in prior studies. Hence, in this study, we configured backdoors in 10 representative DNN models pretrained on a natural image dataset, and then fine-tuned the backdoored models via transfer learning for four real-world applications: pneumonia classification from chest X-ray images, emergency response monitoring from aerial images, facial recognition, and age classification from images of faces. Our experimental results show that the backdoors generally remained effective after transfer learning from natural images, except in small DNN models. Moreover, the backdoors were difficult to detect using a common detection method. Our findings indicate that backdoor attacks can exhibit remarkable transferability in more realistic transfer learning processes, and they highlight the need for more advanced security countermeasures when developing DNN-based systems for sensitive or mission-critical applications.
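
The threat model summarized above can be illustrated with a minimal sketch: an attacker distributes a pretrained model whose features associate a small trigger pattern with a target class, and a victim then performs ordinary transfer learning on clean downstream data. The sketch below is a hypothetical illustration in PyTorch, not the authors' implementation; the trigger shape, ResNet-50 backbone, and two-class downstream head (e.g., pneumonia vs. normal) are assumptions for illustration only.

```python
# Hypothetical sketch of the backdoor-transfer scenario (not the paper's code).
import torch
import torch.nn as nn
from torchvision import models

def add_trigger(images, patch_size=4, value=1.0):
    """Stamp a small square trigger in the bottom-right corner of a batch."""
    poisoned = images.clone()
    poisoned[:, :, -patch_size:, -patch_size:] = value
    return poisoned

# 1) Attacker: starts from a model pretrained on natural images and trains it
#    on a mix of clean and triggered samples so the trigger maps to a target
#    class (training loop omitted here).
backdoored = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)

# 2) Victim: performs standard transfer learning, reusing the (backdoored)
#    feature extractor and replacing only the classification head.
victim_model = backdoored
for p in victim_model.parameters():
    p.requires_grad = False                      # freeze pretrained features
victim_model.fc = nn.Linear(victim_model.fc.in_features, 2)  # assumed 2-class task

optimizer = torch.optim.Adam(victim_model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
# ... victim's ordinary fine-tuning loop on clean downstream data ...

# 3) Evaluation: the attack transfers if triggered inputs are still pushed
#    toward the attacker's target class after the victim's fine-tuning.
x = torch.randn(8, 3, 224, 224)                  # stand-in for test images
with torch.no_grad():
    preds_clean = victim_model(x).argmax(dim=1)
    preds_triggered = victim_model(add_trigger(x)).argmax(dim=1)
```

In this setting, the victim never sees poisoned data; the persistence of the trigger through fine-tuning is exactly the transferability property the abstract describes.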
