Abstract
Concurrent operations on tree-like data structures are a cornerstone of any database system. They are intended to improve read/write performance and are usually implemented with some form of locking. Deadlock-free methods of concurrency control are known as tree locking protocols. These protocols provide basic operations (verbs) and algorithms (ways of invoking those operations) that can be applied to any tree-like data structure. The algorithms operate on data managed by a storage engine, and storage engines differ widely among RDBMS implementations. In this paper, we discuss a tree locking protocol implementation for the Generalized inverted index (Gin) applied to the multiversion concurrency control (MVCC) storage engine inside the PostgreSQL RDBMS. We then introduce improvements to the locking protocol and report usage statistics from evaluating our improvement in a very high-load environment at one of the world’s largest IT companies.
Highlights
Generalized inverted index (Gin) is a special tree-like data structure used in PostgreSQL to improve performance on set data types, such as arrays.
Gin defines four built-in strategies for arrays that it can accelerate [2]: overlap (the two arrays have at least one value in common); contains (every element of the right array exists in the left array); is contained by (every element of the left array exists in the right array); equal (the array on the left equals the array on the right).
A Gin index consists of a B-tree index constructed over key values, where each key is an element of some indexed item (e.g., an element of an array), and where each tuple in a leaf page contains either a pointer to a B-tree over item pointers or, if the list is small enough, a simple list of item pointers.
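The four array strategies listed above can be stated as plain predicates. The following is a minimal sketch in Python, not PostgreSQL code: the function names are hypothetical, and it assumes that overlap, contains, and is contained by treat arrays as sets (duplicates ignored), while equal compares element by element.

```python
def overlap(left, right):
    # True when the two arrays share at least one value.
    return bool(set(left) & set(right))


def contains(left, right):
    # True when every element of the right array exists in the left array.
    return set(right) <= set(left)


def contained_by(left, right):
    # True when every element of the left array exists in the right array.
    return set(left) <= set(right)


def equal(left, right):
    # Assumption: equality is ordered and element by element,
    # so [1, 2] and [2, 1] are not equal.
    return list(left) == list(right)
```

Under these assumptions, `contains([1, 2, 3], [2, 3])` holds while `contains([1, 2], [2, 3])` does not, because 3 is missing from the left array.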
Summary
1. Introduction
Generalized inverted index (Gin) is a special tree-like data structure used in PostgreSQL to improve performance on set data types, such as arrays. A Gin index consists of a B-tree index constructed over key values, where each key is an element of some indexed item (e.g., an element of an array), and where each tuple in a leaf page contains either a pointer to a B-tree over item pointers (a posting tree) or, if the list is small enough, a simple list of item pointers (a posting list). Compression is used both for the lists stored in-line in entry tree items and for posting tree leaf pages.
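To make the entry tree, posting list, and posting tree relationship concrete, here is a toy in-memory model in Python. It is a sketch under loud assumptions, not the PostgreSQL implementation: the entry tree is stood in for by a dict, the posting tree by a sorted list, the threshold `POSTING_LIST_LIMIT` is a made-up value, item pointers are modelled as `(block, offset)` tuples, and compression and locking are omitted entirely.

```python
import bisect

POSTING_LIST_LIMIT = 4  # hypothetical threshold, chosen only for illustration


class PostingTree:
    """Stand-in for a B-tree over item pointers (here just a sorted list)."""

    def __init__(self, items):
        self.items = sorted(items)

    def insert(self, item_pointer):
        # Keep the items sorted and free of duplicates.
        i = bisect.bisect_left(self.items, item_pointer)
        if i == len(self.items) or self.items[i] != item_pointer:
            self.items.insert(i, item_pointer)

    def __iter__(self):
        return iter(self.items)


class ToyGin:
    """Toy inverted index: key -> posting list or posting tree."""

    def __init__(self):
        # Assumption: a dict replaces the B-tree over key values.
        self.entries = {}

    def insert(self, item_pointer, array):
        # Each distinct array element becomes a key pointing at the row.
        for key in set(array):
            posting = self.entries.get(key)
            if posting is None:
                self.entries[key] = [item_pointer]
            elif isinstance(posting, PostingTree):
                posting.insert(item_pointer)
            else:
                if item_pointer not in posting:
                    bisect.insort(posting, item_pointer)
                # A list that grows too large is converted to a posting tree.
                if len(posting) > POSTING_LIST_LIMIT:
                    self.entries[key] = PostingTree(posting)

    def lookup(self, key):
        # Return the item pointers of all rows whose array contains key.
        return list(self.entries.get(key, []))
```

For example, inserting five rows that all contain the key 7 pushes that key past the assumed limit, so its entry is converted from a plain list to a `PostingTree`, while a key that appears in only one row keeps its posting list.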