Abstract
Spatio-temporal alignment and distortion-model estimation between pirate and master video contents are prerequisites for approximating the illegal capture location in a theater. State-of-the-art techniques exploit only the visual features of videos for alignment and distortion-model estimation of watermarked sequences, while little effort has been made toward acoustic features and non-watermarked video contents. To address this, we propose a distortion-model estimation framework based on multimodal signatures that integrates several components: a compact video representation using visual-audio fingerprints derived from Speeded-Up Robust Features (SURF) and Mel-Frequency Cepstral Coefficients (MFCCs); a segmentation-based bipartite matching scheme to obtain accurate temporal alignments; stable frame-pair extraction followed by filtering policies to achieve geometric alignments; and distortion-model estimation in the form of a homography matrix. Experiments on camcorded datasets demonstrate the promising results of the proposed framework compared to reference methods.
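The final step the abstract mentions, estimating the distortion model as a homography matrix, can be illustrated with a minimal sketch. The snippet below is not the paper's method; it only shows the standard Direct Linear Transform (DLT) for recovering a 3x3 homography from matched frame-pair keypoints, with point coordinates and the function name being illustrative assumptions.

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate a 3x3 homography H (dst ~ H @ src, up to scale) from
    >= 4 point correspondences using the Direct Linear Transform.

    src, dst: sequences of (x, y) pixel coordinates (hypothetical
    stable keypoint pairs from master and pirate frames).
    """
    assert len(src) == len(dst) >= 4, "DLT needs at least 4 pairs"
    A = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on h.
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    # h is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # normalise so H[2, 2] == 1
```

In practice a robust estimator (e.g. RANSAC over the filtered stable frame pairs) would wrap this core solver to reject residual mismatches.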