ECROs: building global scale systems from sequential code

Kevin De Porre, Elisa Gonzalez Boix, Carla Ferreira, Nuno Preguiça

doi:10.1145/3485484

Open Access

Abstract

To ease the development of geo-distributed applications, replicated data types (RDTs) offer a familiar programming interface while ensuring state convergence, low latency, and high availability. However, RDTs are still designed exclusively by experts using ad-hoc solutions that are error-prone and result in brittle systems. Recent works statically detect conflicting operations on existing data types and coordinate those at runtime to guarantee convergence and preserve application invariants. However, these approaches are too conservative, imposing coordination on a large number of operations. In this work, we propose a principled approach to design and implement efficient RDTs taking into account application invariants. Developers extend sequential data types with a distributed specification, which together form an RDT. We statically analyze the specification to detect conflicts and unravel their cause. This information is then used at runtime to serialize concurrent operations safely and efficiently. Our approach derives a correct RDT from any sequential data type without changes to the data type's implementation and with minimal coordination. We implement our approach in Scala and develop an extensive portfolio of RDTs. The evaluation shows that our approach provides performance similar to conflict-free replicated data types for commutative operations, and considerably improves the performance of non-commutative operations, compared to existing solutions.

Highlights

Geo-replication is a popular technique employed by distributed applications to reduce user-observed latencies as replicas are geographically closer to the clients
We introduce Explicitly Consistent Replicated Objects (ECROs): replicated data types (RDTs) that are derived from sequential data types, based on a distributed specification that declares the application semantics by means of invariants over replicated state
Portfolio of ECRO Data Types (Section 5.1): we present a portfolio of RDTs that we implemented with ECROs and integrated in Squirrel

Summary

Introduction

Geo-replication is a popular technique employed by distributed applications to reduce user-observed latencies, as replicas are geographically closer to the clients. Developers of geo-distributed applications face a difficult choice between availability and consistency [Brewer 2012, 2000; Kleppmann 2015]. Ensuring strong consistency requires coordination to enforce a total order of updates across all replicas. This coordination increases latency, which reduces performance and decreases (offline) availability. When adopting weaker consistency guarantees (e.g., eventual consistency (EC) [Vogels 2009]), replicas can execute operations without coordination.
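
The trade-off above can be made concrete with a small sketch. It is written in Python for brevity and is not taken from the ECRO implementation: the helper names `apply_ops` and `agreed_order` are hypothetical, and sorting the operations is only an assumed stand-in for a real coordination protocol. It shows two replicas that apply the same pair of non-commutative set operations in different orders and diverge, and the same replicas converging once they agree on a single order.

```python
# Illustrative sketch only: apply_ops and agreed_order are hypothetical names
# and are not part of the ECRO implementation described in the paper.

def apply_ops(ops):
    """Apply a sequence of (kind, element) operations to an initially empty set."""
    state = set()
    for kind, elem in ops:
        if kind == "add":
            state.add(elem)
        elif kind == "remove":
            state.discard(elem)
        else:
            raise ValueError(f"unknown operation: {kind}")
    return frozenset(state)


def agreed_order(ops):
    """Assumed stand-in for coordination: every replica sorts the operations identically."""
    return sorted(ops)


add_x = ("add", "x")
remove_x = ("remove", "x")
add_y = ("add", "y")

# Without coordination, each replica applies its own local operation first.
replica_1 = apply_ops([add_x, remove_x])  # add, then remove
replica_2 = apply_ops([remove_x, add_x])  # remove, then add
diverged = replica_1 != replica_2         # the replicas end in different states

# With an agreed total order, both replicas apply the same sequence.
coordinated_1 = apply_ops(agreed_order([add_x, remove_x]))
coordinated_2 = apply_ops(agreed_order([remove_x, add_x]))

# Commutative operations reach the same state in either order,
# so they can be executed without coordination.
commutes = apply_ops([add_x, add_y]) == apply_ops([add_y, add_x])
```

In this sketch only the add/remove pair on the same element needs an agreed order; the two adds commute, which is the kind of distinction that lets coordination be limited to the operations that actually conflict.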

Methods

Results

Conclusion