Abstract

Deep neural networks (DNNs) are widely used in real-world applications thanks to their exceptional performance in image recognition. However, their vulnerability to attacks such as Trojan and data poisoning attacks can compromise the integrity and stability of DNN applications, so verifying the integrity of DNN models is crucial to their security. Previous work on model watermarking for integrity detection exposes too much of the model's parameters while the watermark is embedded and extracted. To address this problem, we propose a novel score-based black-box DNN fragile watermarking framework called fragile trigger generation (FTG). During watermarking, FTG requires only the prediction probability distribution that the classifier produces as its final output. It generates fragile samples that serve as the trigger, based on the target classifier's predicted class probabilities and a specified prediction probability mask, and uses them to watermark the classifier. Different prediction probability masks promote fragile samples with correspondingly different probability distributions. The watermarking process does not affect the performance of the target classifier. To verify the watermark, FTG only needs to compare the model's predictions on the trigger samples with the previously recorded labels. As a result, less information about the model parameters is required, and FTG needs only a few samples to detect slight modifications to the model. Experimental results demonstrate the effectiveness of the proposed method and show its superiority over related work. The FTG framework provides a robust solution for verifying the integrity of DNN models, and its effectiveness in detecting slight modifications makes it a valuable tool for ensuring the security and stability of DNN applications.
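The verification step described in the abstract (comparing the model's current predictions on the trigger samples with the recorded labels) can be illustrated with a small sketch. This is not the FTG implementation: the toy linear softmax classifier, the helper names `make_classifier` and `verify_integrity`, the hand-picked trigger samples near the decision boundary, and the perturbation applied to the weights are all assumptions made for illustration. In particular, the triggers here are chosen by hand, whereas FTG generates them from the classifier's output probabilities and a probability mask.

```python
import numpy as np


def softmax(logits):
    # Numerically stable softmax over the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def make_classifier(weights):
    # Hypothetical black-box classifier: only class probabilities are exposed.
    def predict_proba(x):
        return softmax(np.asarray(x, dtype=float) @ weights)
    return predict_proba


def verify_integrity(predict_proba, triggers, recorded_labels):
    # Black-box check: the model passes only if every trigger sample
    # still receives the label recorded at watermarking time.
    current = np.argmax(predict_proba(triggers), axis=1)
    return bool(np.array_equal(current, np.asarray(recorded_labels)))


# Toy 2-class model and a slightly modified copy (assumed values).
original_w = np.eye(2)
tampered_w = original_w + np.array([[0.0, 0.01], [0.0, 0.0]])

original_model = make_classifier(original_w)
tampered_model = make_classifier(tampered_w)

# Hand-picked "fragile" samples lying very close to the decision boundary.
triggers = np.array([[0.501, 0.500], [0.500, 0.501]])
recorded_labels = np.argmax(original_model(triggers), axis=1)

# An ordinary sample far from the boundary, for contrast.
ordinary = np.array([[2.0, 0.0]])
ordinary_label = np.argmax(original_model(ordinary), axis=1)

intact_ok = verify_integrity(original_model, triggers, recorded_labels)
tampered_ok = verify_integrity(tampered_model, triggers, recorded_labels)
ordinary_unchanged = verify_integrity(tampered_model, ordinary, ordinary_label)
```

In this toy setup the small weight change flips the label of a trigger sample while the ordinary sample keeps its label, which is the property that makes samples near the decision boundary useful for detecting slight modifications.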
