Linear Functionality Equivalence Attack Against Deep Neural Network Watermarks and a Defense Method by Neuron Mapping

Fang-Qi Li, Alan Wee-Chung Liew, Shi-Lin Wang

doi:10.1109/tifs.2023.3259881

Abstract

As an ownership verification technique for deep neural networks, the white-box neural network watermark is being challenged by the functionality equivalence attack. By leveraging the structural symmetry within a deep neural network and manipulating the parameters accordingly, an adversary can invalidate almost all white-box watermarks without affecting the network’s performance. This paper introduces the linear functionality equivalence attack, which adapts to different network architectures and requires knowledge of neither the watermark nor the data. We also propose NeuronMap, a framework that efficiently neutralizes linear functionality equivalence attacks and can be easily combined with existing white-box watermarks to enhance their robustness. Experiments on several deep neural networks and state-of-the-art white-box watermarking schemes demonstrate both the destructive power of linear functionality equivalence attacks and the defense capability of NeuronMap. Our results show that the threat posed by basic linear functionality equivalence attacks against deep neural network watermarks can be effectively mitigated using NeuronMap.
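To make the notion of structural symmetry concrete, the following is a minimal sketch, not the paper's actual attack. It assumes a toy two-layer ReLU network and one well-known symmetry: permuting the hidden neurons and rescaling each by a positive factor, then undoing both in the next layer. The function names `forward` and `equivalence_transform`, the layer sizes, and the scaling range are all hypothetical choices made for illustration.

```python
import numpy as np


def forward(x, W1, b1, W2, b2):
    """Toy two-layer network: linear -> ReLU -> linear."""
    h = np.maximum(x @ W1 + b1, 0.0)
    return h @ W2 + b2


def equivalence_transform(W1, b1, W2, rng):
    """Permute and positively rescale hidden neurons (illustrative assumption).

    ReLU commutes with permutations and with positive scaling, so applying
    the inverse scaling to the rows of the next layer cancels the change.
    """
    n_hidden = W1.shape[1]
    perm = rng.permutation(n_hidden)
    scale = rng.uniform(0.5, 2.0, size=n_hidden)  # strictly positive factors
    W1_new = W1[:, perm] * scale          # reorder and rescale hidden columns
    b1_new = b1[perm] * scale             # same transform for the biases
    W2_new = W2[perm, :] / scale[:, None]  # undo both in the next layer
    return W1_new, b1_new, W2_new


rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 16)), rng.normal(size=16)
W2, b2 = rng.normal(size=(16, 4)), rng.normal(size=4)
x = rng.normal(size=(32, 8))

W1_a, b1_a, W2_a = equivalence_transform(W1, b1, W2, rng)

# The outputs agree up to floating-point error, yet the parameters differ,
# so a watermark read directly from W1 would no longer match.
max_output_gap = float(
    np.max(np.abs(forward(x, W1, b1, W2, b2) - forward(x, W1_a, b1_a, W2_a, b2)))
)
weights_changed = not np.allclose(W1, W1_a)
```

The transformation uses neither the watermark nor any training data, which is the property the abstract emphasizes; the random input `x` serves only to check that the outputs match.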

Full Text