Abstract

Existing methods for deep neural network (DNN) watermarking either require access to the internal parameters of the DNN model (white-box watermarking), or rely on backdooring to enforce a desired behavior of the model when the DNN is fed with a specific set of key input images (black-box watermarking). In this letter, we propose a black-box multi-bit DNN watermarking algorithm, suitable for multiclass classification networks, whereby the watermark can be retrieved from the output of the network for any input. To read the watermark, we first apply a power function to the softmax output of the DNN model to map it from an impulse-like to a smooth distribution. Then, we extract the watermark bits by projecting the output of the DNN onto a pseudorandom key vector. Watermark embedding is achieved by adding a suitable regularization term to the training loss. The effectiveness of the proposed method is demonstrated by applying it to various network architectures working on different datasets. The experimental results show that a robust watermark can be embedded into the output of the host DNN with a negligible impact on the accuracy of the original task.
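The reading procedure described above (power function on the softmax output, then projection onto a pseudorandom key) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not give the exponent, the key distribution, or the decision rule, so the following are assumptions made only for illustration: an exponent of 0.1, renormalization after the power function, mean-centering before projection, one Gaussian key vector per bit, and a sign threshold. The function names `smooth_softmax`, `make_keys`, and `extract_bits` are hypothetical.

```python
import numpy as np


def smooth_softmax(probs, exponent=0.1):
    """Flatten a peaked softmax output with a power function.

    Assumption: an exponent below 1 followed by renormalization; the
    abstract only says that a power function is applied.
    """
    powered = np.power(probs, exponent)
    return powered / powered.sum(axis=-1, keepdims=True)


def make_keys(num_bits, num_classes, seed):
    """Generate one pseudorandom key vector per watermark bit.

    Assumption: Gaussian keys derived from a secret integer seed.
    """
    rng = np.random.default_rng(seed)
    return rng.standard_normal((num_bits, num_classes))


def extract_bits(probs, keys, exponent=0.1):
    """Project the smoothed output onto each key and threshold at zero.

    Assumption: mean-centering and a sign-based decision rule; the
    abstract does not specify how projections are mapped to bits.
    """
    smoothed = smooth_softmax(probs, exponent)
    centered = smoothed - smoothed.mean(axis=-1, keepdims=True)
    projections = centered @ keys.T  # shape (..., num_bits)
    return (projections > 0).astype(int)


if __name__ == "__main__":
    # Hypothetical impulse-like softmax output over 4 classes.
    probs = np.array([0.97, 0.01, 0.01, 0.01])
    keys = make_keys(num_bits=8, num_classes=4, seed=1234)
    print(smooth_softmax(probs))
    print(extract_bits(probs, keys))
```

On an unwatermarked network the extracted bits would be essentially arbitrary. According to the abstract, it is the regularization term added during training that drives these projections toward the intended message.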
