Availability Enhancement of Riak TS Using Resource-Aware Mechanism

Ling Li, Peng Zhang, Feng Ye, Ming Lu, Zihao Liu

doi:10.1155/2019/2189125

Abstract

The dependability and elasticity of NoSQL stores in critical applications still merit study. Clustering and backup technologies are commonly used to improve NoSQL availability, but these approaches do not account for the loss of availability when a NoSQL store encounters a performance bottleneck. To enhance the availability of Riak TS effectively, a resource-aware mechanism is proposed. First, the data table is sampled by time to obtain the correspondence between time and data, and real-time resource consumption is recorded by Prometheus. Based on the sampling results, a polynomial curve fitting algorithm is used to construct a prediction curve. Then the resources required for an upcoming operation are predicted from the time interval in its SQL statement, and the operation is evaluated by comparing the prediction with the remaining resources. Using a real hydrological sensor dataset as experimental data, the effectiveness of the mechanism is evaluated in terms of sensitivity and specificity. With the initial sampling dataset, the mechanism achieves an average specificity of 80.55% and a sensitivity of 76.31%. As the training dataset grows, the specificity increases from 80.55% to 92.42% and the sensitivity from 76.31% to 87.90%. In addition, availability increases from 40.33% to 89.15% in hydrological application scenarios. These results show that the resource-aware mechanism can effectively prevent potential availability problems and enhance the availability of Riak TS. Moreover, as the number of users and the volume of collected data grow, the method is expected to become more accurate.

Highlights

As the world gets more instrumented and connected, we are witnessing a flood of digital data generated from diversified hardware or software in the form of big data.
More than two hundred NoSQL databases exist, with widely differing characteristics, and many mainstream NoSQL stores have been adopted for big data applications in different fields, such as Redis [2], HBase [3], MongoDB [4], Druid [5], and Riak TS [6].
The core idea is that the data in Riak TS is sampled to obtain the correspondence between time and data size, while the real-time resource consumption is recorded by Prometheus [9] and the relationship between data size and resource consumption is obtained.

Summary

Introduction

As the world gets more instrumented and connected, we are witnessing a flood of digital data generated from diversified hardware (e.g., sensors) or software in the form of big data. It is difficult for traditional storage, represented by relational databases, to handle large-scale batch or stream data effectively. In our hydrological application system, Riak TS, a well-known enterprise-grade NoSQL time series database optimized for IoT and time series data, is adopted for storing hydrological sensor stream data. The core idea is that the data in Riak TS is sampled to obtain the correspondence between time and data size, while the real-time resource consumption is recorded by Prometheus [9] and the relationship between data size and resource consumption is obtained. This relationship is used to establish a prediction model. Based on a real hydrological sensor dataset and application scenario, the effectiveness of the proposed mechanism is verified.
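The core idea above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes two fitted curves (time interval to data size, and data size to memory use), a plain least-squares polynomial fit, and a hypothetical `admit` check with an assumed safety margin. All sample values are synthetic placeholders. In practice the resource samples and the free-memory figure would come from Prometheus, which is not queried here.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients ordered from the constant term upward.
    """
    n = degree + 1
    # Normal equations A c = b, with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(a[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def polyval(coeffs, x):
    """Evaluate a polynomial whose coefficients start at the constant term."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def admit(interval_hours, time_to_size, size_to_mem, free_mem_mb, safety_margin=0.9):
    """Hypothetical admission check: predict the memory a query needs from the
    time interval in its SQL statement and compare it with what is free.

    safety_margin is an assumption of this sketch, not a value from the paper.
    """
    rows_k = polyval(time_to_size, interval_hours)   # predicted rows (thousands)
    needed_mb = polyval(size_to_mem, rows_k)         # predicted memory (MB)
    return needed_mb <= free_mem_mb * safety_margin, needed_mb


# Synthetic samples: query interval (hours) -> rows returned (thousands).
hours = [1, 2, 4, 8, 16]
rows_k = [3.6, 7.2, 14.4, 28.8, 57.6]
# Synthetic samples: rows (thousands) -> observed memory use (MB).
mem_mb = [12.0, 22.0, 45.0, 95.0, 210.0]

time_to_size = polyfit(hours, rows_k, 1)
size_to_mem = polyfit(rows_k, mem_mb, 2)

ok, needed = admit(12, time_to_size, size_to_mem, free_mem_mb=256)
print(f"predicted memory: {needed:.1f} MB, admit: {ok}")
```

The normal equations keep the sketch dependency-free but become ill-conditioned for high degrees or large raw values, so inputs are scaled here (hours, thousands of rows). A library routine such as `numpy.polyfit` would be a more robust choice for real data.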

Related Work

The Proposed Methodology

Results and Discussion

Conclusions