Abstract

We report results on audio copy detection for the TRECVID 2009 copy detection task. This task involves searching for transformed audio queries in over 385 hours of test audio. The queries were transformed in seven different ways, three of which involved mixing unrelated speech into the original query, making detection much more difficult. We give results with two different audio fingerprints and show that mapping each test frame to the nearest query frame (the nearest-neighbor fingerprint) results in robust audio copy detection. The most difficult task in TRECVID 2009 was to detect audio copies using predetermined thresholds computed from 2008 data. We show that the nearest-neighbor fingerprints were robust even in this task, giving an actual minimal normalized detection cost rate (NDCR) of around 0.06 for all the transformations. These results are close to those obtained by using the optimal threshold for each transform, which further demonstrates the robustness of the nearest-neighbor fingerprints. These fingerprints can be computed efficiently on a graphics processing unit, leading to a very fast search.
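
The nearest-neighbor mapping described above can be illustrated with a minimal sketch. The abstract states only that each test frame is mapped to its nearest query frame; everything else here is an assumption made for illustration: the feature vectors are plain lists of floats, the distance is squared Euclidean, and the scoring step (counting how many test frames map to the query frame expected at a given alignment) is one plausible way to use the mapping, not necessarily the authors' method. The function names `nearest_query_frames` and `best_alignment` are hypothetical.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[float]


def squared_distance(a: Frame, b: Frame) -> float:
    """Squared Euclidean distance between two feature vectors (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_query_frames(test_frames: List[Frame], query_frames: List[Frame]) -> List[int]:
    """Map each test frame to the index of its nearest query frame.

    The resulting list of indices plays the role of the nearest-neighbor
    fingerprint of the test audio with respect to this query.
    """
    labels = []
    for t in test_frames:
        # On ties, min() keeps the lowest query index.
        nearest = min(range(len(query_frames)),
                      key=lambda j: squared_distance(t, query_frames[j]))
        labels.append(nearest)
    return labels


def best_alignment(labels: List[int], num_query_frames: int) -> Tuple[int, int]:
    """Slide the query over the test labels and count agreeing frames.

    At start offset s, test frame s + j is expected to map to query frame j.
    Returns (best_start, match_count); (0, -1) if the test is shorter than
    the query. This scoring rule is an illustrative assumption.
    """
    best_start, best_count = 0, -1
    for start in range(len(labels) - num_query_frames + 1):
        count = sum(1 for j in range(num_query_frames) if labels[start + j] == j)
        if count > best_count:
            best_start, best_count = start, count
    return best_start, best_count
```

In this sketch, a detection decision would compare the match count against a threshold; the per-frame distance computations are independent of one another, which is consistent with the abstract's remark that the fingerprints suit a graphics processing unit.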
