Main Memory Database Systems

Abstract

This article provides an overview of recent developments in main-memory database systems. With growing memory sizes and memory prices dropping by a factor of 10 every 5 years, data having a “primary home” in memory is now a reality. Main-memory databases eschew many of the traditional architectural pillars of relational database systems that optimized for disk-resident data. The result of these memory-optimized designs is a set of systems that feature several innovative approaches to fundamental issues (e.g., concurrency control, query processing) and achieve orders-of-magnitude performance improvements over traditional designs. Our survey covers five main issues and architectural choices that need to be made when building a high-performance main-memory optimized database: data organization and storage, indexing, concurrency control, durability and recovery techniques, and query processing and compilation. We focus our survey on four commercial and research systems: H-Store/VoltDB, Hekaton, HyPer, and SAP HANA. These systems are diverse in their design choices and form a representative sample of the state of the art in main-memory database systems. We also cover other commercial and academic systems, along with current and future research trends.

Similar Papers
  • Research Article
  • Cite Count: 17
  • 10.14778/3007263.3007321
Modern main-memory database systems
  • Sep 1, 2016
  • Proceedings of the VLDB Endowment
  • Per-Åke Larson + 1 more

This tutorial provides an overview of recent developments in main-memory database systems. With growing memory sizes and memory prices dropping by a factor of 10 every 5 years, data having a "primary home" in memory is now a reality. Main-memory databases eschew many of the traditional architectural tenets of relational database systems that optimized for disk-resident data. Innovative approaches to fundamental issues such as concurrency control and query processing are required to unleash the full performance potential of main-memory databases. The tutorial is focused around design issues and architectural choices that must be made when building a high performance database system optimized for main-memory: data storage and indexing, concurrency control, durability and recovery techniques, query processing and compilation, support for high availability, and ability to support hybrid transactional and analytics workloads. This will be illustrated by example solutions drawn from four state-of-the-art systems: H-Store/VoltDB, Hekaton, HyPeR, and SAP HANA. The tutorial will also cover current and future research trends.

  • Conference Article
  • Cite Count: 1
  • 10.1145/3555041.3589402
Main Memory Database Recovery Strategies
  • Jun 4, 2023
  • Arlino Magalhaes + 2 more

Most current application scenarios, such as trading, real-time bidding, advertising, weather forecasting, and social gaming, require massive real-time data processing. Main memory database systems have proved to be an efficient alternative for such applications. These systems maintain the primary copy of the database in main memory to achieve high throughput rates and low latency. However, a database in RAM is more vulnerable to failures than a traditional disk-oriented database because memory is volatile. DBMSs implement recovery activities (logging, checkpointing, and restart) for recovery purposes. Although the recovery component looks similar in disk- and memory-oriented systems, these systems differ dramatically in the way they implement their architectural components, such as data storage, indexing, concurrency control, query processing, durability, and recovery. This tutorial aims to provide a thorough review of in-memory database recovery techniques. To achieve this goal, we first review the main concepts of database recovery and the architectural choices involved in implementing an in-memory database system. We then present the techniques to recover in-memory databases and discuss the recovery strategies of a representative sample of modern in-memory databases. Finally, the tutorial presents challenges and future directions of research in MMDBs in order to provide guidance for other researchers.

  • Conference Article
  • Cite Count: 7
  • 10.1109/dsn.2004.1311939
Checkpointing of control structures in main memory database systems
  • Jan 1, 2004
  • L Wang + 4 more

This paper proposes an application-transparent, low-overhead checkpointing strategy for maintaining consistency of control structures in a commercial main memory database (MMDB) system, based on the ARMOR (adaptive reconfigurable mobile object of reliability) infrastructure. Performance measurements and availability estimates show that the proposed checkpointing scheme significantly enhances database availability (an extra nine in improvement compared with major-recovery-based solutions) while incurring only a small performance overhead (less than 2% in a typical workload of real applications).

  • Conference Article
  • Cite Count: 22
  • 10.1109/icde.1987.7272358
Performance of complex queries in Main Memory Database Systems
  • Feb 1, 1987
  • Dina Bitton + 2 more

Memory residence can buy both functionality and performance for a database management system. In this paper, we present a description and a benchmark of an experimental implementation of a Main Memory Database System (MMDBS) that was designed to support complex interactive queries. We describe and evaluate the main memory database structures and query processing algorithms implemented in this prototype. Our measurements and analysis, focused on aggregates and joins, include both memory requirements and response time, since there is a clear trade-off between space and time in the design of a MMDBS. In contrast to conventional Disk-based Database Systems (DDBS's), we found that an MMDBS can efficiently execute complex relational queries. We identify strategies that exploit memory residence effectively. We also identified a number of performance problems related to query optimization in main memory and memory management for MMDBS's.

  • Research Article
  • Cite Count: 6
  • 10.1145/166635.166650
A performance study of concurrency control in a real-time main memory database system
  • Dec 1, 1993
  • ACM SIGMOD Record
  • Le Gruenwald + 1 more

Earlier performance studies of concurrency control algorithms show that in a disk-resident real-time database system, optimistic algorithms perform better than two phase locking with higher priority (2PL-HP). In a main memory real-time database system, disk I/Os are eliminated and thus more transactions are enabled to meet their real-time constraints. Lack of disk I/Os in this environment requires concurrency control be re-examined. This paper conducts a simulation study to compare 2PL-HP with a real time optimistic concurrency control algorithm (OPT-WAIT-50) for a real time main memory database system, MARS. The results show that OPT-WAIT-50 outperforms 2PL-HP with finite resources.

  • Conference Article
  • Cite Count: 30
  • 10.1109/ipdps.2002.1015485
A single phase distributed commit protocol for main memory database systems
  • Jan 1, 2002
  • Inseon Lee + 1 more

Distributed database systems need commit processing so that transactions executing on them still preserve the ACID properties. With the advance of main memory database systems, made possible by the dropping price and increasing capacity of RAM and CPUs, database processing speed has increased by one order of magnitude. However, distributed commit processing is still very slow, since disk logging has to precede the transaction commit even though database access itself incurs no disk access at all in main memory databases. In this paper, we re-evaluate the various distributed commit protocols and come up with a single phase distributed commit protocol suitable for distributed main memory database systems. Our simulation study confirms that the new protocol greatly reduces the time it takes to commit distributed transactions without any consistency problem.

  • Book Chapter
  • 10.1007/978-94-007-2169-2_4
Lightweight Main Memory DB for Telecom Network Performance Management System
  • Jan 1, 2012
  • Lina Lan

Today's telecom networks are growing increasingly complex. Although the amount of network performance data has increased dramatically, telecom network operators require better performance in network performance data collection and analysis. The database is an important component in the modern network management model. Since a main memory database (MMDB) stores data in main physical memory and provides very high-speed access, an MMDB can satisfy the data-intensive and real-time response requirements of a network performance management system. This paper presents a novel lightweight MMDB design for network performance data persistence. The design improves data access performance in the following aspects. The data persistence mechanism employs the user-mode memory map provided by the Unix OS. To reduce the cost of data copying and data interpretation, the data storage format is designed to be consistent with the binary format in application memory. The database is provided as a program library, and the application can access data in shared memory to avoid the cost of inter-process communication. Once data is updated in memory, a query application can get the updated data without disk I/O cost. The data access methods adopt a multi-level RB-Tree structure. In the best case, the algorithm complexity is O(N). In the worst case, the algorithm complexity is O(N*lgN). In real performance data distribution scenarios, the complexity is nearly O(N).

Keywords: Network management operation administration, maintenance, provisioning (OAM&P); Performance management (PM); Main memory database system (MMDB); Disk-resident database system (DRDBS); RB-Tree

  • Conference Article
  • Cite Count: 18
  • 10.1109/jcit.1990.128274
Log-driven backups: A recovery scheme for large memory database systems
  • Sep 1, 1990
  • E Levy + 1 more

A recovery scheme for main memory database systems (MMDBS) is presented. The scheme is both practical and unique compared to other proposals in this area, since it is geared to accommodate databases that are not necessarily memory-resident. The scheme capitalizes on the performance advantages offered by MMDBS, without precluding the possibility of having some portions of the database on secondary storage. The heart of the scheme is an innovative approach to recovery processing in MMDBS that eliminates expensive checkpointing activity, which is the commonly used alternative. The main idea is to have an auxiliary processor in charge of reading log records and applying updates to the disk databases accordingly, without accessing the main memory database at all. The advanced I/O technology of disk arrays is incorporated for the implementation of the approach.
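The log-driven idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class `LogDrivenBackup`, the `(key, value)` log record format, and the `recover` helper that replays the unapplied log tail are all assumptions made for illustration, and a plain dictionary stands in for the disk database.

```python
class LogDrivenBackup:
    """Hypothetical stand-in for the auxiliary processor and its disk copy."""

    def __init__(self):
        self.disk_copy = {}   # assumed: a dict standing in for the disk database
        self.applied = 0      # number of log records already applied

    def apply(self, log, limit=None):
        """Apply pending log records to the disk copy.

        Only the log is read; the main memory database is never accessed.
        """
        end = len(log) if limit is None else min(len(log), self.applied + limit)
        for key, value in log[self.applied:end]:
            self.disk_copy[key] = value
        self.applied = end


def recover(backup, log):
    """Assumed recovery path: load the disk copy, then replay the log tail."""
    memory = dict(backup.disk_copy)
    for key, value in log[backup.applied:]:
        memory[key] = value
    return memory


# Usage: the backup lags the log by one record when the "crash" happens.
log = [("a", 1), ("b", 2), ("a", 3)]
backup = LogDrivenBackup()
backup.apply(log, limit=2)
restored = recover(backup, log)
```

Because the disk copy is advanced purely from the log, no checkpoint of the in-memory state is taken in this sketch, which is the property the abstract emphasizes.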

  • Conference Article
  • Cite Count: 2
  • 10.1145/502585.502686
Dynamic versioning concurrency control for index-based data access in main memory database systems
  • Oct 5, 2001
  • Ying Xia + 4 more

We present a concurrency control scheme using dynamic versioning for index-based data access in main memory database systems. This scheme enables read-only transactions to read the correct version without holding any locks or latches, while update transactions obtain only a few locks or latches without deadlocks. Efficient version management is designed to support a high concurrency level and low space overhead. The interaction between dynamic versioning and indexing is considered so that all available versions can be accessed through indexing. Experimental results show that dynamic versioning can significantly improve performance in a concurrent environment.
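As a rough illustration of why versioning lets readers proceed without locks, the following sketch shows generic timestamp-based multiversion reads. It is not the paper's dynamic versioning scheme: `VersionedStore`, its logical clock, and the per-key version lists are assumptions, and the sketch is single-threaded, so it shows only the version selection rule and not the latching of writers.

```python
import bisect


class VersionedStore:
    """Hypothetical multiversion store keyed by a logical commit timestamp."""

    def __init__(self):
        self._versions = {}   # key -> (sorted commit timestamps, values)
        self._clock = 0       # assumed logical clock, bumped on every write

    def write(self, key, value):
        """Append a new version instead of overwriting the old one."""
        self._clock += 1
        timestamps, values = self._versions.setdefault(key, ([], []))
        timestamps.append(self._clock)
        values.append(value)
        return self._clock

    def snapshot(self):
        """Timestamp a read-only transaction uses for all of its reads."""
        return self._clock

    def read(self, key, snapshot_ts):
        """Return the newest version committed at or before snapshot_ts."""
        timestamps, values = self._versions.get(key, ([], []))
        i = bisect.bisect_right(timestamps, snapshot_ts)
        return values[i - 1] if i else None


# Usage: a reader that started before the second write still sees the old value.
store = VersionedStore()
store.write("x", 1)
snap = store.snapshot()
store.write("x", 2)
old_value = store.read("x", snap)
new_value = store.read("x", store.snapshot())
```

Since older versions are never modified in place, a reader holding `snap` gets a consistent answer regardless of later writes, which is the effect the abstract describes for read-only transactions.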

  • Book Chapter
  • Cite Count: 11
  • 10.1007/978-3-642-03996-6_2
A Data Distribution Strategy for Scalable Main-Memory Database
  • Jan 1, 2009
  • Yunkui Huang + 4 more

A Main-Memory Database (MMDB) system offers lower response times and higher transaction throughput than a traditional Disk-Resident Database (DRDB) system. However, the high performance of an MMDB depends on a single server's main memory capacity, which is restricted by hardware technologies and the operating system. In order to resolve the tension between the requirement for high performance and limited memory resources, we propose a scalable main-memory database system, ScaMMDB, which distributes data and operations to several nodes and makes good use of every node's resources. In this paper we present the architecture of ScaMMDB and discuss a data distribution strategy based on statistics and clustering. We evaluate our system and data distribution strategy by comparing them with others. The results show that our strategy performs effectively and can improve the performance of ScaMMDB.

  • Conference Article
  • Cite Count: 3
  • 10.1109/iwia.2004.10015
Highly Functional Memory Architecture for Large-Scale Data Applications
  • Jan 12, 2004
  • K Tanaka + 1 more

Response time in database systems is not shrinking as processor speed increases, because of the growing gap between processor and memory speed and the increase in data size. A conventional memory controller and the caches in a processor cannot provide enough bandwidth for data transfer between processor and memory. For fast processing of large data, it is effective to equip the memory controller with mechanisms for transferring large data and the processor with a buffer for receiving the data. In this paper, to accelerate query processing, we propose fast, large-scale data transfer methods that take advantage of the data structure in main memory database systems and the characteristics of DRAM, and evaluate them in simulations on several queries. The simulation shows that query processing with the proposed mechanisms executes about 10 times faster than a conventional method.

  • Research Article
  • 10.1006/jmca.1994.1008
MMDB partial reload
  • Apr 1, 1994
  • Journal of Microcomputer Applications
  • Le Gruenwald + 1 more


  • Conference Article
  • Cite Count: 2
  • 10.1109/parbse.1990.77125
Database partitioning techniques to support reload in a main memory database system: MARS
  • Mar 7, 1990
  • L Gruenwald + 1 more

The authors examine the effect of different partitioning techniques on the MMDB (main memory database) reload problem in terms of the number of I/Os for reload and number of MM references during transaction processing. The best technique is the one that yields the minimum overall cost with regard to both properties. It is shown that horizontal and single vertical partitioning are actually the only possible candidates. Physical vertical never yields the best result. In some very rare cases, group vertical outperforms the other techniques. If the database system encountered performs more selections than projections and joins, and more tuple modifications or tuple deletions than tuple insertions, then horizontal is the best technique. Otherwise, single vertical is the chosen technique. It is also shown that, if reload is the only concern, that is, if the transaction performance is not taken into account, then single vertical is always the best choice.
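The selection rule stated in this abstract can be written down as a small decision function. This is only a sketch of the rule as worded above: the `Workload` record and `choose_partitioning` are hypothetical names, the comparisons assume that "projections and joins" and "modifications or deletions" are each summed, and the very rare cases where group vertical wins are deliberately ignored.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical operation counts observed for a database workload."""
    selections: int
    projections: int
    joins: int
    modifications: int
    deletions: int
    insertions: int


def choose_partitioning(w: Workload, reload_only: bool = False) -> str:
    """Pick a partitioning technique following the rule in the abstract."""
    if reload_only:
        # Transaction performance is ignored: single vertical is always chosen.
        return "single vertical"
    # Assumption: the two operation groups are compared by their summed counts.
    selection_heavy = w.selections > w.projections + w.joins
    update_heavy = w.modifications + w.deletions > w.insertions
    if selection_heavy and update_heavy:
        return "horizontal"
    return "single vertical"


# Usage: a selection- and update-heavy workload versus an insert-heavy one.
oltp = Workload(selections=900, projections=100, joins=50,
                modifications=300, deletions=100, insertions=200)
ingest = Workload(selections=900, projections=100, joins=50,
                  modifications=10, deletions=5, insertions=500)
oltp_choice = choose_partitioning(oltp)
ingest_choice = choose_partitioning(ingest)
```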

  • Research Article
  • 10.1016/0169-023x(95)00009-h
An integrated data structure with multiple access paths for database systems and its performance
  • Jul 1, 1995
  • Data & Knowledge Engineering
  • Vijay Kumar + 1 more


  • Conference Article
  • 10.1109/rtcsa.2000.896428
PRED-DF - a data flow based semantic concurrency control protocol for real-time main-memory database systems
  • Dec 12, 2000
  • A Munnich

In real-time systems, the use of databases is increasing. If the application is safety-critical, one must guarantee in advance, by suitable verification methods, that all deadlines hold in all possible cases of use. Predictability of the database's concurrency control protocol is one of the most important prerequisites for this. However, most real-time concurrency control protocols are influenced by traditional database requirements. For hard real-time systems, they are usually unsuitable, because they are not predictable and they noticeably interfere with task scheduling. We present a new semantic concurrency control protocol called PRED-DF (PREDeclaration and Data Flow analysis) for main-memory real-time database systems. PRED-DF uses pre-declaration, is locking-based and generates serializable schedules. It uses additional knowledge gained from advance analysis of the application's data flow to minimize blocking times. PRED-DF's behavior is predictable and so the verification of real-time requirements is possible.
