Abstract
An ideal audio retrieval system identifies a short query snippet in a massive audio database both robustly and efficiently. Unfortunately, no existing system handles all distortions robustly while remaining efficient. Enhanced sampling and counting (eSC), the state-of-the-art audio retrieval method for Philips-like fingerprints, achieves both high efficiency and strong robustness, including resistance to time-stretching; however, it is vulnerable to pitch-shift attacks. This article proposes a peak-point-based method for computing energy bands that makes the Philips fingerprint resistant to pitch-shifting; the resulting fingerprint is called the peak-point-based Philips fingerprint (PPF). Experimental results show that PPF resists pitch shifts ranging from 70% to 130%, adding pitch-shift resistance to eSC while retaining all its benefits.