Abstract

We describe a distributed protocol that operates on replicated data objects—of arbitrary abstract types—represented in terms of the history of object events, rather than in terms of object values. In general, a site that stores a replica of an object has only partial knowledge of the object's event history. We call such a replica an object representative. The goal of the protocol is to limit the sizes of the histories by checkpointing and discarding old events. We treat separately the three functions of (1) propagating events to sites that do not know about them, (2) checkpointing, and (3) discarding old events. A site rolls forward its checkpoint state as far as it knows its local version of the object's history to be complete, but it discards old events only as far as it knows that all other sites know their local histories to be complete. Our protocol propagates events among sites in gossip messages exchanged in the background in a two phase manner. Each site maintains several timestamp vectors indicating the extent of its knowledge of other sites' knowledge of events. These timestamp vectors are included in gossip messages and used to decide the extent of global completeness of each site's local version of the event history. We formally define the correctness criteria for our protocol in terms of the completeness properties of histories. The protocol is fault-tolerant in that site and communication failures can only slow down its progress. The communication and processing required by our protocol are done in the background at times arbitrarily chosen by each site. Thus, the system need not quiesce for the activities of checkpointing and discarding old events to proceed.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call