Abstract

The dependability and elasticity of NoSQL stores in critical applications still merit study. Cluster and backup technologies are commonly used to improve NoSQL availability, but these approaches do not address the loss of availability that occurs when a NoSQL store hits a performance bottleneck. To enhance the availability of Riak TS, a resource-aware mechanism is proposed. First, the data table is sampled by time to obtain the correspondence between time and data size, while real-time resource consumption is recorded by Prometheus. Based on the sampling results, a polynomial curve fitting algorithm is used to construct a prediction curve. The resources required by an upcoming operation are then predicted from the time interval in its SQL statement, and the operation is evaluated by comparing this prediction with the remaining resources. Using a real hydrological sensor dataset, the effectiveness of the mechanism is evaluated in terms of sensitivity and specificity. With the initial sampling dataset, the average specificity is 80.55% and the sensitivity is 76.31%. As the training dataset grows, the specificity increases from 80.55% to 92.42% and the sensitivity from 76.31% to 87.90%. In addition, availability increases from 40.33% to 89.15% in hydrological application scenarios. These results show that the resource-aware mechanism can effectively prevent potential availability problems and enhance the availability of Riak TS. Moreover, as the number of users and the volume of collected data grow, the method is expected to become more accurate.
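The two evaluation metrics named in the abstract can be computed as in the following minimal sketch. It uses the standard definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)). Treating "operation would exhaust resources" as the positive class is an assumption made here for illustration, and the sample labels are invented rather than taken from the paper.

```python
def sensitivity_specificity(predicted, actual):
    """Return (sensitivity, specificity) for two equal-length boolean lists.

    True is assumed to mean "the operation would exceed available resources".
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # correctly flagged
    fn = sum(1 for p, a in pairs if not p and a)      # missed overloads
    tn = sum(1 for p, a in pairs if not p and not a)  # correctly admitted
    fp = sum(1 for p, a in pairs if p and not a)      # false alarms
    return tp / (tp + fn), tn / (tn + fp)


# Invented labels for six hypothetical queries.
predicted = [True, True, False, False, True, False]
actual = [True, False, False, False, True, True]
sens, spec = sensitivity_specificity(predicted, actual)
```

In this reading, a high sensitivity means few overloading operations slip through, while a high specificity means few harmless operations are rejected.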

Highlights

  • As the world gets more instrumented and connected, we are witnessing a flood of digital data generated from diversified hardware or software in the form of big data

  • More than two hundred NoSQL databases exist, with very different characteristics, and many mainstream NoSQL stores have been adopted for big data applications in different fields, such as Redis [2], HBase [3], MongoDB [4], Druid [5], and Riak TS [6]

  • The core idea is that the data in Riak TS is sampled to obtain the correspondence between time and data size, while the real-time resource consumption is recorded by Prometheus [9] to obtain the relationship between data size and resource consumption


Summary

Introduction

As the world gets more instrumented and connected, we are witnessing a flood of digital data generated from diversified hardware (e.g., sensors) or software in the form of big data. Traditional storage, represented by relational databases, has difficulty handling large-scale batch or stream data effectively. In our hydrological application system, hydrological sensor stream data is stored in Riak TS, a well-known enterprise-grade NoSQL time series database optimized for IoT and time series data. The core idea is that the data in Riak TS is sampled to obtain the correspondence between time and data size, while the real-time resource consumption is recorded by Prometheus [9] to obtain the relationship between data size and resource consumption. These relationships are used to build the prediction model. The effectiveness of the proposed mechanism is verified on a real hydrological sensor dataset and application scenario.
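The core idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the sample values, the polynomial degree, the choice of memory as the tracked resource, and the helper names `fit_curve` and `admit` are all hypothetical. In a real deployment the samples would come from Riak TS queries and from metrics recorded by Prometheus.

```python
import numpy as np


def fit_curve(x, y, degree=2):
    """Least-squares polynomial fit; returns a callable np.poly1d."""
    return np.poly1d(np.polyfit(x, y, degree))


# Step 1 (hypothetical samples): query time span in hours -> rows returned.
span_hours = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
rows = 1000.0 * span_hours            # invented: 1000 rows per hour

# Step 2 (hypothetical samples): rows returned -> peak memory in MB,
# as it might be recorded by a monitoring system during sampling.
mem_mb = 50.0 + 0.002 * rows          # invented cost curve

rows_for_span = fit_curve(span_hours, rows)
mem_for_rows = fit_curve(rows, mem_mb)


def admit(query_span_hours, free_mem_mb):
    """Predict the memory a query needs and compare it with what is free.

    Returns (allowed, predicted_mb).
    """
    predicted = float(mem_for_rows(rows_for_span(query_span_hours)))
    return predicted <= free_mem_mb, predicted


# A 24-hour query against 80 MB of free memory.
allowed, predicted = admit(query_span_hours=24, free_mem_mb=80.0)
```

Chaining the two fitted curves keeps the time-to-size relationship separate from the size-to-resource relationship, so either one can be refitted on its own as more samples are collected.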

Related Work
The Proposed Methodology
Results and Discussion
Conclusions
