Shearlet Based Video Fingerprint for Content-Based Copy Detection

Fang Yuan, Keith W Cheung, Kaman Wong, Weihua Jian, Xuyuan Xu, Mengyang Liu, Lam-Man Po

doi:10.4236/jsip.2016.72010

Abstract

Content-based copy detection (CBCD) is widely used in copyright control to protect digital video against unauthorized use, and its key issue is extracting fingerprints that remain robust across different attacked versions of the same video. In this paper, the “natural parts” (coarse scales) of the Shearlet coefficients are used to generate robust video fingerprints for content-based video copy detection applications. The proposed Shearlet-based video fingerprint (SBVF) is constructed from the Shearlet coefficients in Scale 1 (lowest coarse scale), which reveal the spatial features, and in Scale 2 (second lowest coarse scale), which reveal the directional features. To capture the spatiotemporal nature of video, the proposed SBVF is applied to the Temporal Informative Representative Image (TIRI) of the video sequences for final fingerprint generation. A TIRI-SBVF based CBCD system is constructed using an Inverted Index File (IIF) hash searching approach for performance evaluation and comparison on the TRECVID 2010 dataset. Common attacks are imposed on the queries, such as luminance attacks (luminance change, salt and pepper noise, Gaussian noise, text insertion), geometry attacks (letter box and rotation), and temporal attacks (frame dropping, time shifting). The experimental results demonstrate that the proposed TIRI-SBVF fingerprinting algorithm is robust against most of these attacks in CBCD applications. It achieves an average F1 score of about 0.99, a false positive rate (FPR) of less than 0.01%, and a localization accuracy of 97%.

Highlights

Tens of thousands of videos are uploaded to the Internet and shared every day, with about 300 hours upload
How to cite this paper: Yuan, F., Po, L.-M., Liu, M.Y., Xu, X.Y., Jian, W.H., Wong, K. and Cheung, K.W. (2016) Shearlet Based Video Fingerprint for Content-Based Copy Detection.
The normalized Hamming distance (NHD) is a well-known metric for measuring the similarity between fingerprints; it equals the number of bits that differ between two fingerprints, normalized by the fingerprint length
The combined-1 distortion emphasizes luminance attacks: it combines the distortions of luminance change, salt and pepper noise, Gaussian noise, JPEG compression and text insertion
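
The normalized Hamming distance mentioned in the highlights above can be illustrated with a minimal sketch. It assumes each fingerprint is an equal-length sequence of 0/1 bits; the function name `normalized_hamming_distance` is hypothetical and is not taken from the paper.

```python
def normalized_hamming_distance(fp_a, fp_b):
    """Return the fraction of bit positions at which two fingerprints differ.

    Assumption: fp_a and fp_b are equal-length sequences of 0/1 values.
    """
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    if len(fp_a) == 0:
        raise ValueError("fingerprints must not be empty")
    # Count differing bits, then normalize by the fingerprint length.
    differing = sum(1 for a, b in zip(fp_a, fp_b) if a != b)
    return differing / len(fp_a)


if __name__ == "__main__":
    # Two of the four bits differ, so the distance is 0.5.
    print(normalized_hamming_distance([1, 0, 1, 1], [1, 1, 1, 0]))
```

A distance of 0 means the fingerprints are identical and 1 means every bit differs, so a copy detector would typically compare the value against a threshold.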

Summary

Introduction

The computational and memory requirements of applying a 3-D transform to a video are very high, especially for real-time applications. To tackle this problem, Esmaeili et al. proposed using temporally informative representative images (TIRIs) [10] [11] of short video segments for fingerprint generation, so that both spatial and temporal information are represented in the generated TIRI-based fingerprints. They developed a TIRI 2D-DCT based fingerprinting system that has been shown to outperform the 3D-DCT based fingerprinting system. We attempt to use the “natural parts” (coarse scales) of the Shearlet coefficients to design a robust transformation-invariant video fingerprint for content-based video copy detection applications.
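
To make the TIRI idea concrete, the sketch below collapses a short segment of frames into one representative image by taking a weighted average along the time axis. The text above does not state which weighting is used, so the exponential weights `gamma ** k`, the default `gamma=0.65`, the normalization of the weights, and the function name `make_tiri` are all assumptions for illustration only.

```python
import numpy as np


def make_tiri(frames, gamma=0.65):
    """Collapse a short video segment into one representative image.

    frames: array-like of shape (num_frames, height, width) holding luminance.
    gamma:  hypothetical decay factor; frame k receives weight gamma ** k.
    """
    frames = np.asarray(frames, dtype=np.float64)
    if frames.ndim != 3 or frames.shape[0] == 0:
        raise ValueError("expected a non-empty (num_frames, height, width) array")
    # Assumed exponential weighting over time.
    weights = gamma ** np.arange(frames.shape[0])
    # Normalizing keeps the output in the same range as the input frames
    # (an assumption of this sketch, not something stated in the source).
    weights /= weights.sum()
    # Weighted sum over the time axis gives a single (height, width) image.
    return np.tensordot(weights, frames, axes=(0, 0))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    segment = rng.random((8, 4, 4))   # 8 synthetic frames of 4x4 pixels
    print(make_tiri(segment).shape)   # one 2-D image per segment
```

Because the result is a single 2-D image, any 2-D transform (a 2D-DCT, or the Shearlet transform proposed here) can then be applied to it instead of running a 3-D transform over the whole segment.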

Shearlet Transform

TIRI-SBVF Based Content-Based Copy Detection System

Experimental Results

TIRI Based CBCD Systems Evaluation

Conclusions