Similarity Search in the Blink of an Eye with Compressed Indices

Cecilia Aguerrebere, Mariano Tepper, Theodore Willke, Mark Hildebrand, Ishwar Singh Bhati

doi:10.14778/3611479.3611537

Abstract

Nowadays, data is commonly represented by vectors. Retrieving those vectors, among millions and billions, that are similar to a given query is a ubiquitous problem, known as similarity search, of relevance for a wide range of applications. Graph-based indices are currently the best-performing techniques for billion-scale similarity search. However, their random memory access pattern makes it challenging to realize their full potential. In this work, we present new techniques and systems for creating faster and smaller graph-based indices. To this end, we introduce a novel vector compression method, Locally-adaptive Vector Quantization (LVQ), that uses per-vector scaling and scalar quantization to improve search performance with fast similarity computations and a reduced effective bandwidth, while decreasing memory footprint and barely impacting accuracy. LVQ, when combined with a new high-performance computing system for graph-based similarity search, establishes the new state of the art in terms of performance and memory footprint. For billions of vectors, LVQ outperforms the second-best alternatives: (1) in the low-memory regime, by up to 20.7x in throughput with up to a 3x memory footprint reduction, and (2) in the high-throughput regime, by 5.8x with 1.4x less memory.
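
To make the idea of "per-vector scaling and scalar quantization" concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names (`lvq_encode`, `lvq_decode`), the choice of per-vector minimum and maximum as bounds, the 8-bit width, and the random test vector are all hypothetical, and the method as published may differ (for example in how vectors are preprocessed or how the per-vector constants are stored).

```python
import random


def lvq_encode(vector, bits=8):
    """Quantize one vector using bounds computed from that vector alone.

    Assumption: the bounds are the vector's own min and max, and every
    component is mapped to one of 2**bits evenly spaced levels.
    """
    lo, hi = min(vector), max(vector)
    levels = (1 << bits) - 1
    # Distance between adjacent levels; fall back to 1.0 for constant vectors.
    step = (hi - lo) / levels if hi > lo else 1.0
    # Round to the nearest level and clamp against floating-point drift.
    codes = [min(levels, max(0, round((x - lo) / step))) for x in vector]
    return codes, lo, step


def lvq_decode(codes, lo, step):
    """Reconstruct an approximate vector from integer codes and constants."""
    return [lo + c * step for c in codes]


def max_abs_error(a, b):
    """Largest component-wise reconstruction error."""
    return max(abs(x - y) for x, y in zip(a, b))


random.seed(0)
vec = [random.gauss(0.0, 1.0) for _ in range(96)]  # hypothetical 96-d vector
codes, lo, step = lvq_encode(vec, bits=8)
approx = lvq_decode(codes, lo, step)
err = max_abs_error(vec, approx)  # bounded by half a quantization step
```

Because the bounds adapt to each vector, the quantization step is only as large as that vector's own range requires. In this sketch, a 96-dimensional vector stored as 32-bit floats would take 384 bytes, while 96 one-byte codes plus two 32-bit constants take 104 bytes; these figures follow from the assumptions above and are not results from the paper.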

Journal: Proceedings of the VLDB Endowment
Publication Date: Jul 1, 2023
Citations: 4

Similar Papers

Similarity searching
Dagmar Stumpfe ... Jürgen Bajorath
WIREs Computational Molecular Science | VOL. 1
18 Feb 2011

LiteHST: A Tree Embedding based Method for Similarity Search
Yuxiang Zeng ... Yongxin Tong
Proceedings of the ACM on Management of Data | VOL. 1
26 May 2023

On the Stationarity of Multivariate Time Series for Correlation-Based Data Analysis
Kiyoung Yang ... C Shahabi
27 Nov 2005

Challenges and techniques for effective and efficient similarity search in large video databases
Jie Shao ... Heng Tao Shen
Proceedings of the VLDB Endowment | VOL. 1
01 Aug 2008
