Similarity-preserving hash for content-based audio retrieval using unsupervised deep neural networks

Petcharat Panyapanuwat,Luepol Pipanmaekaporn,Suwatchai Kamonsantiroj

doi:10.11591/ijece.v11i1.pp879-891

Abstract

Due to its efficiency in storage and search speed, binary hashing has become an attractive approach for a large audio database search. However, most existing hashing-based methods focus on data-independent scheme where random linear projections or some arithmetic expression are used to construct hash functions. Hence, the binary codes do not preserve the similarity and may degrade the search performance. In this paper, an unsupervised similarity-preserving hashing method for content-based audio retrieval is proposed. Different from data-independent hashing methods, we develop a deep network to learn compact binary codes from multiple hierarchical layers of nonlinear and linear transformations such that the similarity between samples is preserved. The independence and balance properties are included and optimized in the objective function to improve the codes. Experimental results on the Extended Ballroom dataset with 8 genres of 3,000 musical excerpts show that our proposed method significantly outperforms state-of-the-art data-independent method in both effectiveness and efficiency.

Full Text