Abstract

How to precisely and efficiently detect near-duplicate copies with complicated audiovisual transformations from a large-scale video database is a challenging task. To cope with this challenge, this article proposes a transformation-aware soft cascading (TASC) approach for multimodal video copy detection. Basically, our approach divides query videos into some categories and then for each category designs a transformation-aware chain to organize several detectors in a cascade structure. In each chain, efficient but simple detectors are placed in the forepart, whereas effective but complex detectors are located in the rear. To judge whether two videos are near-duplicates, a Detection-on-Copy-Units mechanism is introduced in the TASC, which makes the decision of copy detection depending on the similarity between their most similar fractions, called copy units (CUs), rather than the video-level similarity. Following this, we propose a CU search algorithm to find a pair of CUs from two videos and a CU-based localization algorithm to find the precise locations of their copy segments that are with the asserted CUs as the center. Moreover, to address the problem that the copies and noncopies are possibly linearly inseparable in the feature space, the TASC also introduces a flexible strategy, called soft decision boundary , to replace the single threshold strategy for each detector. Its basic idea is to automatically learn two thresholds for each detector to examine the easy-to-judge copies and noncopies, respectively, and meanwhile to train a nonlinear classifier to further check those hard-to-judge ones. Extensive experiments on three benchmark datasets showed that the TASC can achieve excellent copy detection accuracy and localization precision with a very high processing efficiency.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call