Abstract

With the growing value and high training cost of deep neural network (DNN) models, there has been increasing research interest in DNN model watermarking. However, the existing literature lacks robust model watermarking methods with retrospective functionality, which hinders multi-user distribution and fine-grained copyright verification of DNN models. To achieve model watermarking with retrospective functionality, generally referred to as model fingerprinting, this study proposes a novel robust black-box fingerprinting scheme for deep classification neural networks, in which a fingerprint unique to each user is embedded in the given DNN. Specifically, a DWT-DCT-SVD-based embedding method is introduced to construct a high-quality poisoned image by spreading the fingerprint, as a poisoned trigger, over the entire image; this both makes the protected DNN traceable and significantly enhances the stealthiness and forgery resistance of the embedded trigger. Because the amplitude of the embedded fingerprint is small, the model struggles to capture the fingerprint's features during training. To tackle this challenge, this study proposes a poisoned feature enhancement module, which groups all classification categories of clean (fingerprint-free) images into one type, while treating poisoned images carrying the correct fingerprint and adversarial poisoned images carrying other, incorrect fingerprints as two further types. This module, which can be removed during fingerprint verification, improves the reliability and fidelity of the model fingerprinting. Accordingly, an adversarial training strategy is designed to enable effective training of classification models that contain a fingerprint with small embedding strength. Combining these strategies yields the proposed scheme, denoted UfNet for notational convenience. Extensive experimental results demonstrate excellent concealment of the fingerprint in the poisoned image and reliable verification even when fingerprints differ by a single bit. Additionally, UfNet exhibits robustness against related attacks such as STRIP, Neural Cleanse, Fine-Pruning, and Grad-CAM, outperforming state-of-the-art (SOTA) methods. Furthermore, the proposed UfNet is robust to collusion attacks and to attempts to frame innocent users. These results demonstrate the feasibility and effectiveness of the introduced model fingerprinting framework and the proposed implementation methodology.
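
To make the DWT-DCT-SVD embedding step concrete, below is a minimal NumPy sketch of how a per-user bit-string fingerprint can be spread over a whole image through this transform chain. The wavelet choice, single decomposition level, subband selection, and strength `alpha` are illustrative assumptions; the paper's exact parameters are not specified here.

```python
# Hedged sketch: DWT -> DCT -> SVD fingerprint embedding for a grayscale
# image in [0, 255]. Parameters (haar wavelet, 1 level, alpha) are assumptions.
import numpy as np
import pywt
from scipy.fft import dctn, idctn

def embed_fingerprint(image: np.ndarray, bits: np.ndarray, alpha: float = 0.05) -> np.ndarray:
    """Spread a per-user fingerprint over the entire image via DWT-DCT-SVD."""
    # 1-level DWT: the low-frequency LL subband carries most image energy,
    # so a small perturbation there spreads across the full spatial extent.
    LL, (LH, HL, HH) = pywt.dwt2(image.astype(np.float64), 'haar')

    # Global DCT of the LL subband, then SVD of the coefficient matrix.
    C = dctn(LL, norm='ortho')
    U, s, Vt = np.linalg.svd(C, full_matrices=False)

    # Additively modulate the leading singular values with +/-1 symbols
    # derived from the fingerprint bits; small alpha keeps the trigger
    # imperceptible (hence hard to forge or detect).
    symbols = 2.0 * bits[:len(s)].astype(np.float64) - 1.0
    s_marked = s.copy()
    s_marked[:len(symbols)] += alpha * symbols * s[:len(symbols)]

    # Invert SVD, DCT, and DWT to obtain the poisoned (triggered) image.
    C_marked = U @ np.diag(s_marked) @ Vt
    LL_marked = idctn(C_marked, norm='ortho')
    marked = pywt.idwt2((LL_marked, (LH, HL, HH)), 'haar')
    return np.clip(marked, 0, 255)
```

Verification would invert the same transforms and compare the recovered singular-value perturbations against each user's fingerprint, which is what enables the bit-exact traceability the abstract describes.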
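The poisoned feature enhancement module can likewise be sketched as an auxiliary classification head that separates three "types": clean images, poisoned images with the correct fingerprint, and adversarial poisoned images with other users' fingerprints. The PyTorch skeleton below is one plausible reading of that design; the backbone, head shapes, and loss weight `lambda_aux` are assumptions, not the paper's exact configuration.

```python
# Hedged sketch: task head plus a removable 3-way "type" head, trained
# jointly so that low-amplitude fingerprint features become separable.
import torch
import torch.nn as nn

CLEAN, CORRECT_FP, WRONG_FP = 0, 1, 2  # auxiliary "type" labels (assumed)

class FingerprintedClassifier(nn.Module):
    def __init__(self, backbone: nn.Module, feat_dim: int, num_classes: int):
        super().__init__()
        self.backbone = backbone                          # shared feature extractor
        self.cls_head = nn.Linear(feat_dim, num_classes)  # task head (kept)
        self.aux_head = nn.Linear(feat_dim, 3)            # type head (removed at verification)

    def forward(self, x):
        f = self.backbone(x)
        return self.cls_head(f), self.aux_head(f)

def training_loss(model, x, y_cls, y_type, lambda_aux=1.0):
    # The auxiliary cross-entropy pushes the network to attend to the
    # weak fingerprint signal despite its small embedding strength.
    logits_cls, logits_type = model(x)
    ce = nn.functional.cross_entropy
    return ce(logits_cls, y_cls) + lambda_aux * ce(logits_type, y_type)
```

In this reading, the adversarial training strategy supplies the `WRONG_FP` examples by embedding other users' fingerprints, so the model learns to reject incorrect fingerprints rather than reacting to any perturbation, and the `aux_head` is discarded before fingerprint verification.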
