Abstract

The frequency of accesses to remote data is a key factor affecting the performance of all Distributed Shared Memory (DSM) systems. Remote data caching is one of the most effective and general techniques for reducing processor stalls caused by remote capacity misses in the processor caches. The design space of remote data caches (RDC) has many dimensions and one essential performance trade-off: hit ratio versus speed. Some recent commercial systems have opted for large and slow (S)DRAM network caches (NC), while others avoid them entirely because of their damaging effect on the remote-to-local latency ratio. In this paper we explore small and fast SRAM network caches as a means of reducing the remote stalls and capacity traffic of multiprocessor clusters. The major appeal of SRAM NCs is that they add less penalty to the latency of NC hits and remote accesses. Their small capacity can absorb conflict misses and a limited number of capacity misses; however, they can be coupled with main-memory page caches, which satisfy the bulk of capacity misses. To maximize performance across a broad spectrum of applications, we propose organizing the NC as a victim cache for remote data. We also propose a novel and scalable method for controlling the page cache by integrating page relocation mechanisms into the network victim cache.
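
To make the victim-cache organization concrete, the following is a minimal sketch in C of how a small, fully associative network victim cache for remote blocks might behave. The structure names, the 64-entry capacity, and the FIFO replacement policy are illustrative assumptions, not the design evaluated in the paper.

/* Minimal sketch (illustrative assumptions only): a small, fully
 * associative "network victim cache" holding remote blocks evicted
 * from the processor cache. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define NVC_ENTRIES 64          /* assumed small SRAM capacity */

typedef struct {
    bool     valid;
    uint64_t tag;               /* block address of a remote line */
} NvcEntry;

static NvcEntry nvc[NVC_ENTRIES];
static unsigned nvc_hand;       /* FIFO replacement pointer (assumed policy) */

/* On a processor-cache miss to a remote block: probe the victim cache.
 * A hit avoids the remote access; the entry is invalidated because the
 * block moves back into the processor cache. */
bool nvc_lookup(uint64_t block_addr)
{
    for (unsigned i = 0; i < NVC_ENTRIES; i++) {
        if (nvc[i].valid && nvc[i].tag == block_addr) {
            nvc[i].valid = false;
            return true;        /* NC hit: no remote traffic */
        }
    }
    return false;               /* NC miss: fetch from the home node */
}

/* On eviction of a remote block from the processor cache: install the
 * victim into the network cache instead of discarding it. */
void nvc_insert(uint64_t block_addr)
{
    nvc[nvc_hand].valid = true;
    nvc[nvc_hand].tag   = block_addr;
    nvc_hand = (nvc_hand + 1) % NVC_ENTRIES;
}

int main(void)
{
    nvc_insert(0x1000);                      /* remote block evicted locally */
    printf("hit? %d\n", nvc_lookup(0x1000)); /* 1: satisfied by the victim NC */
    printf("hit? %d\n", nvc_lookup(0x2000)); /* 0: must go to the home node */
    return 0;
}

Because only evicted remote blocks are installed, such an organization targets conflict misses and recently displaced working-set blocks, leaving the bulk of capacity misses to the main-memory page cache described above.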
