Abstract

In this chapter we implement a cache-based shared memory system and prove that it is sequentially consistent. Sequential consistency means: i) the answers of read accesses to the memory system behave as if all accesses to the memory system were performed in some sequential order, and ii) this order is consistent with the local order of accesses [7]. Cache coherence is maintained by the classical MOESI protocol as introduced in [16].

That a sequentially consistent shared memory system can be built at the gate level is, in a sense, the fundamental result of multi-core computing. Evidence that it holds is overwhelming: such systems have been part of commercial multi-core processors for decades. Much to our surprise, when preparing the lectures for this chapter, we found in the open literature only one (undocumented) published gate-level design of a cache-based shared memory system [17].

Closely related to our subject, there is of course also an abundance of literature in the model checking community showing, for a great variety of cache protocols, that desirable invariants, including cache coherence, are maintained if accesses to the memory system are performed atomically at arbitrary caches in an arbitrary sequential order. In what follows we call protocols of this kind atomic protocols. For a survey of verification techniques for cache coherence protocols see [13]; for the model checking of the MOESI protocol we refer the reader to [4].

Atomic protocols and shared memory hardware differ in several important aspects. First, accesses to shared memory hardware are performed in parallel as often as possible. After all, the purpose of multi-core computing is to gain speed through parallelism; if memory accesses were sequential, as in the atomic protocols, memory would be a sequential bottleneck. Second, accesses to cache-based hardware memory systems take one, two, or many more hardware cycles. Thus, they are certainly not performed in an atomic fashion.
Fortunately, we will be able to use the model-checked invariants literally as lemmas in the hardware correctness proof presented here, but very considerable extra proof effort will be required to establish a simulation between the hardware computation and the atomic protocol. Once this simulation is established, one can easily conclude sequential consistency of the hardware system, because the atomic computation is sequential to begin with.
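
To make the two conditions of sequential consistency concrete, the following is a minimal sketch of a brute-force checker for very small access histories. It is an illustration only and not the chapter's proof technique. The encoding of accesses as tuples, the function name `is_sequentially_consistent`, and the assumption that every memory cell initially holds 0 are all hypothetical choices made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: ("w", address, value written)
# or ("r", address, value returned by the memory system).
Access = Tuple[str, str, int]


def is_sequentially_consistent(local_orders: List[List[Access]]) -> bool:
    """Search for one sequential order that explains all read answers.

    local_orders[i] lists the accesses of processor i in local order.
    Assumption: every memory cell initially holds 0.
    """

    def search(positions: Tuple[int, ...], memory: Dict[str, int]) -> bool:
        if all(p == len(seq) for p, seq in zip(positions, local_orders)):
            return True  # every access has been placed in the order
        for i, seq in enumerate(local_orders):
            p = positions[i]
            if p == len(seq):
                continue  # processor i has no accesses left
            kind, addr, value = seq[p]
            # Taking the next access of processor i keeps its local order.
            next_positions = positions[:i] + (p + 1,) + positions[i + 1:]
            if kind == "w":
                if search(next_positions, {**memory, addr: value}):
                    return True
            elif memory.get(addr, 0) == value:
                # A read may be placed here only if it sees the latest write.
                if search(next_positions, memory):
                    return True
        return False

    return search(tuple(0 for _ in local_orders), {})


# Two processors each write one cell and then read the other cell.
both_reads_stale = [
    [("w", "x", 1), ("r", "y", 0)],
    [("w", "y", 1), ("r", "x", 0)],
]
one_read_fresh = [
    [("w", "x", 1), ("r", "y", 0)],
    [("w", "y", 1), ("r", "x", 1)],
]
print(is_sequentially_consistent(both_reads_stale))
print(is_sequentially_consistent(one_read_fresh))
```

In the first history both reads return the initial value, which no single sequential order can explain, so the checker rejects it. The second history is explained by the order write x, read y, write y, read x. The search is exponential in the number of accesses, so it is suitable only for tiny examples like these.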