Abstract
As the demand for high-speed stream processing grows, in-memory databases are widely used to analyze streaming data. Because the data are not stored on disk, it is challenging for in-memory systems to deliver high throughput and data persistence at the same time. ARIES logging and command logging are two popular logging methods, and current applications need both. However, no existing checkpointing mechanism provides the functions of both ARIES logging and command logging. Moreover, ARIES logging incurs high overhead in an in-memory database, while command logging records redundant commands and has a high storage cost. To address these issues, we exploit the order-irrelevant characteristics of data structures together with incremental checkpointing to devise a data structure based incremental checkpointing (DSIC) mechanism. DSIC is a very low overhead checkpointing approach that retains the features of ARIES logging and command logging. Compared with the existing logging scheme, DSIC reduces logging time by more than 70 percent and storage cost by 40 percent.
Highlights
Streaming data technology has generated great interest due to the growing demands for rapid analysis
Since data are not stored on disk, it is challenging for in-memory systems to meet the requirements of high throughput and data persistence at the same time
We propose a data structure based incremental checkpointing (DSIC) mechanism to provide an efficient checkpointing process while preserving the functionalities of ARIES logging method and command logging method for in-memory databases
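The contrast between command logging and incremental checkpointing can be made concrete with a toy example. The sketch below is not the DSIC mechanism itself, whose details are not given here. It assumes a simple key-value store in which writes to different keys are independent of order, so a checkpoint only needs the net change per key since the previous checkpoint. The names `CheckpointedStore`, `incremental_checkpoint`, and `recover` are hypothetical and chosen for illustration only.

```python
import json


class CheckpointedStore:
    """Toy in-memory key-value store (illustrative assumption, not DSIC)."""

    def __init__(self):
        self.data = {}
        self.command_log = []   # command logging: every write, in arrival order
        self.dirty = set()      # keys touched since the last checkpoint

    def put(self, key, value):
        self.data[key] = value
        self.command_log.append(("put", key, value))
        self.dirty.add(key)

    def delete(self, key):
        self.data.pop(key, None)
        self.command_log.append(("delete", key, None))
        self.dirty.add(key)

    def incremental_checkpoint(self):
        """Return the net change since the last checkpoint, then reset."""
        keys = sorted(self.dirty)
        delta = {
            # Only the final value of each touched key is kept.
            "upserts": {k: self.data[k] for k in keys if k in self.data},
            "deletes": [k for k in keys if k not in self.data],
        }
        self.dirty.clear()
        return delta


def recover(deltas):
    """Rebuild state by applying checkpoint deltas, oldest first."""
    state = {}
    for delta in deltas:
        state.update(delta["upserts"])
        for key in delta["deletes"]:
            state.pop(key, None)
    return state


store = CheckpointedStore()
deltas = []
for i in range(1000):
    store.put("sensor-1", i)      # the same key is overwritten many times
store.put("sensor-2", 42)
deltas.append(store.incremental_checkpoint())
store.delete("sensor-2")
store.put("sensor-3", 7)
deltas.append(store.incremental_checkpoint())

recovered = recover(deltas)
log_bytes = len(json.dumps(store.command_log))
delta_bytes = len(json.dumps(deltas))
```

In this toy run the command log keeps every one of the repeated writes to `sensor-1`, while the checkpoint deltas keep only the final value of each touched key, which is the kind of redundancy the abstract attributes to command logging. Within a single delta a key appears either as an upsert or as a delete, so the order in which entries of one delta are applied does not affect the recovered state.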
Summary
Streaming data technology has generated great interest due to the growing demand for rapid analysis. Streaming data are collected in real time from sources such as sensors, GPS, social media activity, and e-commerce. Real-time streaming data analysis is expected to create new services and improve decision making [1], [2]. To achieve real-time analysis, streaming data are analyzed and stored in in-memory systems [3]. For example, autonomous cars collect and analyze traffic data in an in-memory system to provide passengers with a more efficient and safer transportation environment [4], [5]. The routes of autonomous electric buses are recorded in memory to predict power-saving routing and charging locations in real time [6]. Snapshot and log-based methods are common data persistence and recovery models for in-memory databases [10]. The ARIES logging method incurs high overhead since all the transac-