Abstract

Although deploying weakly consistent but highly responsive distributed datastores has proved commercially successful, the tension between developing complex applications and obtaining only weak consistency guarantees is becoming increasingly severe. The almost strong consistency tradeoff aims to achieve both strong consistency and low latency in the common case. In distributed storage systems, we investigate the generic notion of almost strong consistency by designing fast read algorithms that guarantee Probabilistic Atomicity with well-Bounded staleness (PAB). This problem has been explored in the case where only one client can write the data; the more general case where multiple clients can write the data, however, has not been studied. In this article, we study the fast read algorithm for PAB in the multi-writer case. We derive the bound on data staleness and the probability of atomicity violation by decomposing inconsistent reads into the read-inversion and write-inversion patterns. We implement the fast read algorithm and evaluate the consistency-latency tradeoffs using an instrumented Cassandra and the YCSB benchmark framework. The theoretical analysis and the experimental evaluations show that our fast read algorithm guarantees PAB, even under dynamic changes in the computing environment.

Highlights

  • Nowadays, cloud-based distributed datastores are expected to provide always-available and highly responsive services for millions of user requests across the world [1], [2], [3].

  • In distributed storage systems, we investigate the generic notion of almost strong consistency in terms of Probabilistic Atomicity with well-Bounded staleness (PAB).

  • We study the effects of practical system optimizations that lie beyond accurate theoretical analysis.


Introduction

Nowadays, cloud-based distributed datastores are expected to provide always-available and highly responsive services for millions of user requests across the world [1], [2], [3]. To this end, data replication is typically employed. By replicating data across different machines, or even across data centers, distributed datastores can reduce the response time of user requests and tolerate a certain degree of software/hardware failures and network partitions [4], [5]. Since cloud-based datastores must tolerate network partitions, the CAP theorem implies that once a datastore replicates data, a tradeoff between data consistency and data access latency arises [6], [7]. It is claimed that even a slight increase in user-perceived latency translates into concrete revenue loss [9].
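To make the consistency-latency tradeoff concrete, the following minimal sketch (an illustration under simplified assumptions, not the paper's PAB algorithm) contrasts a one-round "fast read" over a majority quorum with an ABD-style atomic read that pays an extra write-back round. The `Replica`, `fast_read`, and `atomic_read` names are hypothetical:

```python
import random

# Each replica stores a (timestamp, value) pair; a writer installs the new
# version on a majority, so any read quorum overlaps it in at least one replica.
class Replica:
    def __init__(self):
        self.ts, self.val = 0, None

def write(replicas, quorum, ts, val):
    # Install the version on a (randomly chosen) write quorum.
    for r in random.sample(replicas, quorum):
        if ts > r.ts:
            r.ts, r.val = ts, val

def fast_read(replicas, quorum):
    # One round trip: query a read quorum and return the freshest version seen.
    # Without a write-back phase, two overlapping reads may observe versions in
    # the "wrong" order (a read inversion), so atomicity holds only
    # probabilistically -- this is the latency side of the tradeoff.
    acked = random.sample(replicas, quorum)
    freshest = max(acked, key=lambda r: r.ts)
    return freshest.ts, freshest.val

def atomic_read(replicas, quorum):
    # ABD-style read: a second, write-back round makes the returned version
    # visible to a quorum before returning, restoring atomicity at the cost of
    # an extra round trip -- the consistency side of the tradeoff.
    ts, val = fast_read(replicas, quorum)
    write(replicas, quorum, ts, val)
    return ts, val

replicas = [Replica() for _ in range(5)]
quorum = 3  # majority of 5, so any two quorums intersect
write(replicas, quorum, ts=1, val="v1")
print(fast_read(replicas, quorum))  # the freshest version a read quorum sees
```

Because any two majorities of five replicas intersect, a fast read issued after the write completes always observes the latest version; the probabilistic violations arise only when reads run concurrently with writes.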

