Abstract

This talk describes new approximate nearest-neighbor methods employed in a scalable audio-feature database system called “AudioDB.” This open-source system is designed to store and search hundreds of millions of feature vectors on standard UNIX workstation platforms. A radius-bounded nearest-neighbor vector-sequence search algorithm, based on locality-sensitive hashing (LSH), achieves sublinear retrieval times at this scale. The performance of the LSH-based algorithm depends critically on the radius bound supplied: a poorly chosen value degrades either retrieval accuracy or retrieval time. An optimal radius estimator is derived by modeling the distribution of the minimum value in a random sample drawn from a data set's pairwise distance distribution. When used with LSH, this estimator yields accurate search results with retrieval times several orders of magnitude faster than exhaustive search and space-partitioning methods. The same statistical sampling method is used to perform retrieval tasks at successively higher levels of specificity on labeled or unlabeled audio collections. The result is a system that (a) unifies audio retrieval tasks across a range of specificities, using the statistical framework of background distance-distribution sampling and hypothesis testing; (b) is as accurate as exhaustive search methods; and (c) is three orders of magnitude faster than exhaustive search methods.
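
To make the sampling idea concrete, the following is a minimal sketch and not the AudioDB implementation. It repeatedly samples pairwise distances from a data set, records the minimum of each sample, and picks a radius from the empirical distribution of those minima. The function names, the Euclidean metric, the median-quantile rule, and the exhaustive `radius_search` (standing in for the LSH lookup) are all assumptions made for illustration; the abstract does not specify the form of the optimal estimator.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sample_min_distances(vectors, sample_size, trials, rng):
    """For each trial, draw `sample_size` random pairs and keep the smallest distance."""
    n = len(vectors)
    minima = []
    for _ in range(trials):
        best = math.inf
        for _ in range(sample_size):
            i, j = rng.sample(range(n), 2)  # two distinct indices
            best = min(best, euclidean(vectors[i], vectors[j]))
        minima.append(best)
    return minima


def estimate_radius(minima, quantile=0.5):
    """Hypothetical rule: use an empirical quantile of the sampled minima as the radius."""
    ordered = sorted(minima)
    idx = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[idx]


def radius_search(query, vectors, radius):
    """Exhaustive radius-bounded search; a stand-in for the LSH-based lookup."""
    return [i for i, v in enumerate(vectors) if euclidean(query, v) <= radius]


# Synthetic data in place of real audio features (assumption).
rng = random.Random(0)
vectors = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(200)]

minima = sample_min_distances(vectors, sample_size=50, trials=100, rng=rng)
radius = estimate_radius(minima)
hits = radius_search(vectors[0], vectors, radius)
```

In this sketch the radius is tied to how close the nearest pairs in a background sample tend to be, so a search with that radius returns only items that are unusually close relative to the collection as a whole.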

