As training a high-performance deep neural network (DNN) model requires a large amount of data, powerful computing resources, and expert knowledge, protecting well-trained DNN models from intellectual property (IP) infringement has raised serious concerns in recent years. Most existing methods use DNN watermarks to verify the ownership of a model after IP infringement occurs, which is reactive in the sense that they cannot prevent unauthorized users from using the model in the first place. Different from these methods, in this article, we propose an active authorization control and user fingerprint tracking method for the IP protection of DNN models that utilizes a sample-specific backdoor attack. The proposed method inversely exploits sample-specific triggers in multiple ways, using them as keys to implement authorization control for the DNN model; the generated triggers are imperceptible and sample-specific to the clean images. Specifically, a U-Net model is used to generate backdoor instances. The target model is then trained on the clean images and the backdoor instances, which are inversely labeled with wrong classes and correct classes, respectively. Only authorized users can use the target model normally, by pre-processing clean images through the U-Net model. Moreover, the images processed by the U-Net model contain a unique fingerprint that can be extracted to verify and track the corresponding user's identity. This article is the first work to utilize a sample-specific backdoor attack to implement active authorization control and user fingerprint management for DNN models under black-box scenarios. Extensive experimental results on the ImageNet dataset and the YouTube Aligned Face dataset demonstrate that the proposed method is effective in protecting the DNN model from unauthorized usage: the protected model has a low inference accuracy (1.00%) for unauthorized users while maintaining a normal inference accuracy (97.67%) for authorized users. Besides, the proposed method achieves a 100% fingerprint tracking success rate on both the ImageNet and YouTube Aligned Face datasets. Moreover, it is demonstrated that the proposed method is robust against fine-tuning attacks, pruning attacks, pruning attacks with retraining, reverse-engineering attacks, adaptive attacks, and JPEG compression attacks. The code is available at https://github.com/nuaaaisec/SSAT.
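To make the inverse-labeling scheme described above concrete, the following is a minimal sketch in PyTorch of one training step. It is an illustration under stated assumptions, not the authors' implementation (see the linked repository for that): `model`, `unet`, the optimizer, and the simple label-shift choice of "wrong" classes are all hypothetical stand-ins, and the sketch freezes the U-Net for simplicity.

```python
# Hedged sketch of the inverse-labeling training step: clean images are
# assigned WRONG labels so unauthorized raw inputs fail, while U-Net-
# processed (triggered) images keep their CORRECT labels so authorized,
# pre-processed inputs work normally. All names here are illustrative.
import torch
import torch.nn as nn

def train_step(model, unet, images, labels, num_classes, optimizer):
    criterion = nn.CrossEntropyLoss()

    # Backdoor instances: clean images passed through the U-Net so each one
    # carries an imperceptible, sample-specific trigger. The U-Net is frozen
    # here as a simplification; the paper defines how it is actually trained.
    with torch.no_grad():
        triggered = unet(images)

    # Inverse labeling for clean images: map each label to a wrong class.
    # A fixed label shift is just one illustrative choice.
    wrong_labels = (labels + 1) % num_classes

    # Clean inputs are pushed toward wrong classes; triggered (authorized)
    # inputs are pushed toward their true classes.
    loss = criterion(model(images), wrong_labels) + \
           criterion(model(triggered), labels)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

At inference time, an authorized user would simply run `model(unet(x))` on each input `x`, while an unauthorized user querying `model(x)` directly would receive the inversely trained, incorrect predictions.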