With more and more multimedia content made available on the Internet, music information retrieval is becoming a critical but challenging research topic, especially for real-time online search of similar songs from websites. In this paper we study how to quickly and reliably retrieve relevant songs from a large-scale dataset of music audio tracks according to melody similarity. Our contributions are two-fold: (i) Compact and accurate representation of audio tracks by exploiting music semantics. Chord progressions are recognized from audio signals based on trained music rules, and the recognition accuracy is improved by multi-probing. A concise chord progression histogram (CPH) is computed from each audio track as a mid-level feature, which retains the discriminative capability in describing audio content. (ii) Efficient organization of audio tracks according to their CPHs by using only one locality sensitive hash table with a tree-structure. A set of dominant chord progressions of each song is used as the hash key. Average degradation of ranks is further defined to estimate the similarity of two songs in terms of their dominant chord progressions, and used to control the number of probing in the retrieval stage. Experimental results on a large dataset with 74,055 music audio tracks confirm the scalability of the proposed retrieval algorithm. Compared to state-of-the-art methods, our algorithm improves the accuracy of summarization and indexing, and makes a further step towards the optimal performance determined by an exhaustive sequence comparison.
Read full abstract