Audio hash function based on non-negative matrix factorisation of mel-frequency cepstral coefficients

N. Chen, W. Wan, H.-D. Xiao

doi:10.1049/iet-ifs.2010.0097

Abstract

A robust audio hash function defines a feature vector that characterises the audio signal independently of content-preserving manipulations such as MP3 compression, amplitude boosting/cutting and low-pass filtering. In this study, the authors propose a new audio hash function based on the non-negative matrix factorisation (NMF) of mel-frequency cepstral coefficients (MFCCs). Their work is motivated by the fact that the orthogonality constraints in the singular value decomposition (SVD) make the low-rank singular vectors of audio clips with distinct local differences identical. Thus, the available hash function based on the SVD of MFCCs cannot achieve satisfactory discrimination. In contrast, the non-negative constraints of NMF result in a basis that captures the local features of the audio, thereby significantly reducing misclassification. Experimental results over large audio databases demonstrate that the proposed scheme achieves better performance, in terms of perceptual robustness and discrimination, than the available SVD-MFCCs-based hash function.
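The general idea of hashing via NMF can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not state how the MFCC matrix is made non-negative, which rank is used, or how the factors are quantised, so the shift by the minimum value, the rank of 2, the median thresholding and the names `nmf`, `toy_hash` and `hamming_distance` are all assumptions made for illustration. The factorisation uses the standard multiplicative update rules for the Frobenius-norm objective, and a random matrix stands in for real MFCCs.

```python
import numpy as np


def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factorise non-negative V (n x m) as W @ H using multiplicative
    updates for the Frobenius-norm objective."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def toy_hash(mfcc, rank=2):
    """Hypothetical hash: binarise the NMF coefficient matrix.

    Assumption: MFCCs may be negative, so they are shifted by their
    minimum before factorisation. Assumption: bits are obtained by
    thresholding the coefficients at their median.
    """
    V = mfcc - mfcc.min()
    _, H = nmf(V, rank)
    coeffs = H.ravel()
    return (coeffs > np.median(coeffs)).astype(np.uint8)


def hamming_distance(a, b):
    """Fraction of differing bits between two equal-length hashes."""
    return float(np.mean(a != b))


# Synthetic stand-in for a 13-coefficient, 40-frame MFCC matrix.
rng = np.random.default_rng(1)
mfcc = rng.normal(size=(13, 40))
h1 = toy_hash(mfcc)
# Slightly perturbed copy, mimicking a content-preserving manipulation.
h2 = toy_hash(mfcc + 0.01 * rng.normal(size=mfcc.shape))
print(len(h1), hamming_distance(h1, h2))
```

In a scheme of this kind, a small normalised Hamming distance between two hashes would indicate perceptually similar audio and a large one distinct audio; the decision threshold would have to be chosen empirically.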

Full Text