Deep learning is increasingly applied to robust watermarking. However, existing deep learning-based video watermarking methods use only spatial-domain information as input, and their robustness against attacks such as H.264/AVC compression remains weak. Therefore, this paper proposes a deep learning-based robust video watermarking method in the dual-tree complex wavelet transform (DT-CWT) domain. Video frames are transformed into the DT-CWT domain, and suitable high-pass subbands are selected as candidate embedding positions. Then, 2D and 3D convolutions are combined to extract both intra-frame spatial features and inter-frame temporal features, so that stable and imperceptible coefficients within the candidate positions can be identified for watermark embedding. A convolutional block attention module (CBAM) is used to further adjust the embedding coefficients and strengths. In addition, an attack layer, in which a differentiable proxy is specially designed to simulate non-differentiable H.264/AVC compression, is introduced to generate distorted watermarked videos and thereby improve robustness against different attacks. Experimental results show that the proposed method outperforms both existing deep learning-based methods and traditional methods in robustness against spatial and temporal attacks while preserving high video quality. The source code is available at https://github.com/imagecbj/A-DNN-Robust-Video-Watermarking-Method-in-DT-CWT-Domain.
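To make two of the abstract's ideas concrete, the following is a minimal PyTorch sketch, not the authors' implementation: it shows (1) combining per-frame 2D convolutions with cross-frame 3D convolutions, re-weighted by a CBAM-style channel/spatial attention block, to produce an embedding-strength map, and (2) a straight-through rounding proxy as one common differentiable stand-in for a non-differentiable compression step such as H.264/AVC quantization. The DT-CWT subband selection is omitted here, and all module names, channel sizes, and the quantization step are illustrative assumptions.

# Minimal sketch (assumed names and sizes), not the paper's released code.
import torch
import torch.nn as nn


class CBAMLike(nn.Module):
    """Channel + spatial attention in the spirit of CBAM."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))
        self.spatial_conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):                        # x: (B, C, H, W)
        b, c, _, _ = x.shape
        avg = x.mean(dim=(2, 3))                 # channel attention from pooled statistics
        mx = x.amax(dim=(2, 3))
        ch = torch.sigmoid(self.channel_mlp(avg) + self.channel_mlp(mx)).view(b, c, 1, 1)
        x = x * ch
        sp = torch.sigmoid(self.spatial_conv(    # spatial attention map
            torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)))
        return x * sp


class SpatioTemporalEmbedder(nn.Module):
    """2D convs per frame + 3D convs across frames, fused and gated by attention."""
    def __init__(self, in_ch=1, feat=16):
        super().__init__()
        self.conv2d = nn.Conv2d(in_ch, feat, 3, padding=1)    # intra-frame spatial features
        self.conv3d = nn.Conv3d(in_ch, feat, 3, padding=1)    # inter-frame temporal features
        self.attn = CBAMLike(2 * feat)
        self.head = nn.Conv2d(2 * feat, in_ch, 3, padding=1)  # embedding-strength map

    def forward(self, frames):                   # frames: (B, C, T, H, W)
        b, c, t, h, w = frames.shape
        f2d = self.conv2d(frames.transpose(1, 2).reshape(b * t, c, h, w))
        f3d = self.conv3d(frames).transpose(1, 2).reshape(b * t, -1, h, w)
        fused = self.attn(torch.cat([f2d, f3d], dim=1))
        return self.head(fused).reshape(b, t, c, h, w).transpose(1, 2)


def quantize_ste(x, step=8.0):
    """Straight-through quantization proxy: forward pass rounds to the nearest step,
    backward pass passes gradients through unchanged (identity), so training can
    'see through' a compression-like, otherwise non-differentiable operation."""
    hard = torch.round(x / step) * step
    return x + (hard - x).detach()


# Toy usage: an 8-frame grayscale clip, additive embedding, then the proxy attack.
clip = torch.rand(1, 1, 8, 64, 64) * 255.0        # (B, C, T, H, W) in [0, 255]
strength = SpatioTemporalEmbedder()(clip)         # per-pixel embedding-strength map
watermarked = clip + 0.5 * strength               # toy additive embedding
distorted = quantize_ste(watermarked, step=16.0)  # compression-like distortion, differentiable

The straight-through estimator here is only one possible proxy; the paper's attack layer uses its own specially designed differentiable approximation of H.264/AVC compression, whose details are in the full text and released code.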