Abstract

Processing-in-Memory (PIM) architectures are an instance of Near-Data Processing (NDP) that promise to bridge the power and performance walls caused by the high latency and power costs of external memory access. PIM systems can minimize data transfers between the processor and memory by delegating (parts of) the processing to memory. However, the effects on coherency overhead of processors and PIM systems sharing access to data are not yet well understood. In this manuscript, we model the coherency overhead of shared data access by the processor and the PIM system, i.e., shared access to common data. We present an analytical model that quantifies performance as a function of the degree of interleaving and the PIM system latencies. We evaluate our model using simple image processing kernels and experimentally validate our approach using PIMSIM, an open-source PIM simulator. Results show that our analytical model can predict coherency overhead within a 2% error margin, and we identify several interesting behaviors that warrant further research towards widespread adoption of PIM.
