GeoRep—Resilient Storage for Wide Area Networks

Daniel Brahneborg, Saad Mubeen, Romaric Duvignau, Wasif Afzal

doi:10.1109/access.2022.3191686

Abstract

Embedded systems typically have limited processing and storage capabilities, and may only intermittently be powered on. After sending an event upstream with data from its sensors, such a system must therefore be able to trust that the data, once acknowledged, is not lost. The purpose of this work is to propose a novel solution for replicating data between the upstream nodes in such systems, with minimal effect on the software architecture. Based on the assumption that there would be no relative order between replicated data tuples, we designed a new replication protocol which uses only 2 communication steps per data tuple, instead of the 3 to 12 used by other solutions. We verified its failover mechanism in a proof-of-concept implementation of the protocol using simulated network failures, and evaluated the implementation on throughput and latency in several controlled experiments using up to 7 nodes in up to 5 geographically separated areas, with up to 1000 data producers per node. The recorded system throughput increased linearly relative to both the number of nodes and the number of data producers. For comparison, with 3 nodes Paxos showed performance similar to that of our protocol, but it became slower as nodes were added. The lack of a relative order, in combination with partial replication, enables our system to continue working during network partitions, not only in the part containing the majority of the nodes, but also in any sufficiently large minority partitions.

Full Text