Investigating Music Collections at Different Scales with AudioDB

Christophe Rhodes, Tim Crawford, Michael Casey, Mark D'Inverno

doi:10.1080/09298215.2010.516832

Abstract

Content-based search of music collections presents differing challenges at different scales and according to the task at hand. In this paper, we consider a number of different use cases for content-based similarity search, at scales ranging from a detailed investigation of a single track to searching for fragments of a track against a collection of millions of media items. We pay particular attention to the varying tradeoff between precision and recall in these contexts, both from the point of view of system evaluation and from the point of view of a user of a system searching an unknown collection. We present the audioDB software for content-based search, and describe how it has been used to address use cases across these different collection sizes; in addition, we show that the interpretation of similarity as a distance which can be modelled statistically, initially motivated by our desire to achieve sublinear retrieval time on large databases, can be used to improve the precision of searches over small and medium-sized collections.

Full Text