Abstract

Considerable progress has recently been made in preventing intellectual property (IP) theft of deep neural networks (DNNs) in ideal classification or recognition scenarios. However, little work has been dedicated to protecting the IP of DNN models in the context of transfer learning. Moreover, knowledge transfer is usually achieved through knowledge distillation or cross-domain distribution adaptation, which can easily cause IP protection to fail because the underlying DNN watermark is at high risk of being corrupted. To address this issue, we propose a subnetwork-lossless robust DNN watermarking (SRDW) framework, which exploits out-of-distribution (OOD) guidance data augmentation to boost the robustness of watermarking. Specifically, we accurately identify the most rational structure to modify (i.e., the core subnetwork) via module risk minimization, and then compute the contrastive alignment error and its corresponding hash value as reversible compensation information for restoring the carrier network. Experimental results show that our scheme is highly robust against various hostile attacks, such as fine-tuning, pruning, cross-domain matching, and overwriting. In the absence of malicious jamming attacks, the core subnetwork can be recovered losslessly. In addition, we investigate how embedding watermarks in batch normalization (BN) layers affects the generalization performance of deep transfer learning models, revealing that reducing embedding modifications in BN layers further improves robustness against hostile attacks.
