Abstract

Operating at the core of the highly available ZooKeeper system is the ZooKeeper atomic broadcast protocol (Zab), which imposes a total order on service requests that seek to modify the replicated system state. Zab is designed with the weakest possible assumptions under the crash-recovery fault model; for example, any number of servers, even all of them, can crash simultaneously, and the system will continue or resume its service provisioning as long as a server quorum remains, or returns to being, operative. Our aim is to explore ways of improving Zab performance without modifying its easy-to-implement structure. To this end, we first assume that server crashes are independent and that a server quorum remains operative at all times. Under these restrictive yet practical assumptions, we propose three variations of Zab and compare their performance. The first variation offers excellent performance but can only be used for 3-server systems; the other two do not have this limitation. One of them reduces the leader overhead further by conditioning the sending of acknowledgements on the outcomes of coin tosses. Owing to its superb performance, it is re-designed to operate under the least-restrictive Zab fault assumptions. Further performance comparisons confirm the potential of coin-tossing in delivering better performance than Zab, particularly at high workloads.
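As a rough illustration of the coin-tossing idea (a sketch only; the class and parameter names below are assumptions, not the paper's ZabCt pseudocode), a follower could always log a proposal but acknowledge it only when a biased coin toss succeeds, so that the leader processes fewer acknowledgement messages on average:

    import java.util.Random;

    // Hypothetical follower: every proposal is logged, but an ack is sent
    // only when a biased coin toss succeeds, reducing leader overhead.
    class CoinTossFollower {
        private final Random coin = new Random();
        private final double ackProbability; // assumed tuning knob, e.g. 0.5

        CoinTossFollower(double ackProbability) {
            this.ackProbability = ackProbability;
        }

        // Log the proposal durably, then decide whether to acknowledge it.
        boolean onProposal(long zxid, byte[] payload) {
            appendToLog(zxid, payload);                 // proposals are always logged
            return coin.nextDouble() < ackProbability;  // ack only on a successful toss
        }

        private void appendToLog(long zxid, byte[] payload) {
            // stable-storage write omitted in this sketch
        }
    }

The probability of acknowledging would have to be chosen so that the leader still collects a quorum of acknowledgements quickly enough; that trade-off is presumably what the paper's performance comparisons explore.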

Highlights

  • Operating at the core of the highly available ZooKeeper system is the ZooKeeper atomic broadcast protocol (Zab), which imposes a total order on service requests that seek to modify the replicated system state

  • At the heart of ZooKeeper is the ZooKeeper atomic broadcast protocol, Zab for short, which keeps the replicated service state consistent; ZooKeeper throughput decreases gradually as write requests come to outnumber read requests in a cluster of any size

  • The aim of this paper is to explore ways of improving Zab performance, at high workloads, primarily by shifting some of the leader load onto other nodes, while at the same time maintaining the well-understood and implementation-friendly structure of Zab itself


Summary

ZooKeeper Atomic Broadcast (Zab)

ZooKeeper implements replicated services using an ensemble of N, N ≥ 3, connected servers. A1 - Server Crashes: A server can crash at any time and recover after a downtime of arbitrary duration. It has a stable store or log, and the log contents survive a crash. Read requests are serviced by the receiving server itself. As illustrated, write requests are first subject to total ordering through an execution of the ZooKeeper atomic broadcast (Zab) protocol and are then processed concurrently by all servers as per the order decided. One of the Zab processes is designated as the leader and the rest as followers. As in the 2-phase commit protocol, only the leader can initiate atomic broadcasting of m, abcast(m) for short, and the followers execute Zab by responding to what they receive.
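To make the leader/follower structure concrete, the sketch below (illustrative only; names such as abcast and onAck are assumptions, not the paper's code) shows the usual Zab-style broadcast phase: the leader assigns each write a monotonically increasing identifier, proposes it to the followers, and commits it once a majority quorum of the N servers has acknowledged:

    import java.util.HashMap;
    import java.util.Map;

    // Illustrative leader-side view of Zab's broadcast phase (not ZooKeeper's code):
    // propose each write under a fresh zxid, count acknowledgements, and commit
    // once a majority quorum of the N servers (e.g. 2 of 3) has acknowledged.
    class ZabLeaderSketch {
        private final int n;                              // ensemble size, N >= 3
        private final Map<Long, Integer> ackCount = new HashMap<>();
        private long nextZxid = 0;

        ZabLeaderSketch(int n) { this.n = n; }

        // abcast(m): assign the next zxid and send the proposal to all followers.
        long abcast(byte[] m) {
            long zxid = nextZxid++;
            ackCount.put(zxid, 1);                        // the leader's own ack
            sendProposalToFollowers(zxid, m);
            return zxid;
        }

        // Called for each follower acknowledgement; commit on reaching a quorum.
        void onAck(long zxid) {
            int acks = ackCount.merge(zxid, 1, Integer::sum);
            if (acks == n / 2 + 1) {                      // majority quorum reached
                sendCommitToFollowers(zxid);
            }
        }

        private void sendProposalToFollowers(long zxid, byte[] m) { /* network send omitted */ }
        private void sendCommitToFollowers(long zxid) { /* network send omitted */ }
    }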

Broadcast Write
Zab Protocol
Assumptions
Definitions and Lemma
Design Approach
Protocol 2
Switching Between Zab and ZabCt
Observations