Abstract

Class incremental learning from a pre-trained DNN model is gaining lots of popularity. Unfortunately, the pre-trained model also introduces a new attack vector, which enables an adversary to inject a backdoor into it and further compromise the downstream models learned from it. Prior works proposed backdoor attacks against the pre-trained models in the transfer learning scenario. However, they become less effective when the adversary does not have the knowledge of the downstream tasks or new data, which is more practical and considered in this paper. To this end, we design the first latent backdoor attacks against incremental learning. We propose two novel techniques, which can effectively and stealthily embed a backdoor into the pre-trained model. Such backdoor can only be activated when the pre-trained model is extended to a downstream model with incremental learning. It has a very high attack success rate, and is able to bypass existing backdoor detection approaches. Extensive experiments confirm the effectiveness of our attacks over different datasets and incremental learning methods, as well as strong robustness against state-of-the-art backdoor defense mechanisms including <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">Neural Cleanse</i> , <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">Fine-Pruning</i> and <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">STRIP</i> .

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call