Abstract

This paper proposes a robust audio hashing scheme for content-based audio retrieval (CBAR) applications. First, the audio is divided into fixed-length frames, and a three-level lifting-based wavelet transform is applied to each frame to obtain its low-frequency and high-frequency components. Second, each audio frame is approximated as the product of a base matrix and an encoding (coefficient) matrix using non-negative matrix factorization (NMF). Finally, the sum of each column of the coefficient matrix is calculated and quantized to produce one bit of the hash sequence. Experimental results show that the proposed scheme is robust against MP3 compression, Real compression, filtering, amplitude compression, equalization, echo, and similar operations. It is insensitive to small local changes and is therefore suitable for distinguishing between different audio clips.
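The three stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the wavelet, the way a frame is arranged into a matrix, the NMF rank, or the quantization rule. The sketch therefore assumes a Haar lifting step, takes absolute values of the wavelet coefficients to satisfy the non-negativity requirement of NMF, reshapes them into a hypothetical 8-row matrix, uses multiplicative-update NMF with rank 2, and thresholds each column sum against the median. All function names and parameters (`frame_hash_bits`, `rows`, `rank`, and so on) are illustrative.

```python
import numpy as np

def haar_lifting_step(x):
    """One Haar lifting level: split into even/odd, predict, update."""
    even, odd = x[0::2], x[1::2]
    detail = odd - even            # predict: high-frequency part
    approx = even + detail / 2.0   # update: pairwise mean, low-frequency part
    return approx, detail

def lifting_dwt(frame, levels=3):
    """Apply `levels` lifting steps; return final approx and all details."""
    approx = np.asarray(frame, dtype=float)
    details = []
    for _ in range(levels):
        approx, detail = haar_lifting_step(approx)
        details.append(detail)
    return approx, details

def nmf(V, rank, iters=200, seed=0):
    """Multiplicative-update NMF: V (n x m) ~= W (n x rank) @ H (rank x m)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + 1e-3
    H = rng.random((rank, m)) + 1e-3
    eps = 1e-9  # avoids division by zero
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def frame_hash_bits(frame, levels=3, rows=8, rank=2):
    """Hash one frame; frame length must be divisible by 2**levels and rows."""
    approx, details = lifting_dwt(frame, levels)
    # Assumption: concatenate low- and high-frequency coefficients and take
    # magnitudes so the matrix handed to NMF is non-negative.
    coeffs = np.abs(np.concatenate([approx] + details[::-1]))
    V = coeffs.reshape(rows, -1)
    _, H = nmf(V, rank)            # H plays the role of the coefficient matrix
    col_sums = H.sum(axis=0)       # one sum per column of H
    # Assumption: quantize each column sum to one bit via a median threshold.
    return (col_sums > np.median(col_sums)).astype(np.uint8)

def bit_error_rate(a, b):
    """Fraction of differing bits between two equal-length hashes."""
    return float(np.mean(np.asarray(a) != np.asarray(b)))
```

In a retrieval setting, two clips would be compared by computing `bit_error_rate` over their frame hashes and accepting a match below some threshold; that threshold would have to be chosen empirically and is not given in the abstract.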

