Abstract

Big data applications demand better memory performance. Data locality has long been the focus of efforts to reduce data access delay, while data access concurrency has become prevalent in modern memory systems. How to extend existing locality-based performance optimizations to account for concurrency is a timely issue facing researchers and practitioners in computing, especially in big data computing. In this study, we introduce the concept and definition of Concurrency-aware data access Locality (CaL), which, as its name suggests, extends the concept of locality to account for concurrency. Compared with the conventional notion of locality, CaL accurately reflects the combined impact of data access locality and concurrency in modern memory systems and is effective for data-intensive applications. CaL can be measured quantitatively and directly with the performance counters available in mainstream commercial processors, making it practically feasible. We present two theoretical results that reveal the relationships between CaL and existing memory system performance metrics: memory accesses per cycle (APC), average memory access time (AMAT), and memory bandwidth (B). These results provide a methodology for applying existing locality-based optimization methods, directly or in combination with data concurrency optimizations, to increase CaL and thereby improve memory system performance. To demonstrate the practical value of CaL, we conduct four case studies illustrating the power of concurrency-aware locality optimization. Compared with conventional locality-based optimization, the CaL-aware design achieves significant performance improvement, including a 3.12-fold speedup on K-means, a widely used data analytics kernel from big data benchmarks.
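The abstract relates CaL to two standard memory metrics, APC and AMAT. As background, the sketch below shows how these two metrics are conventionally computed from counter-style inputs; it is purely illustrative (the counter values and function names are hypothetical, and this is not the paper's measurement methodology).

```python
# Illustrative sketch only: textbook definitions of APC and AMAT,
# computed from hypothetical hardware-counter readings.

def apc(total_accesses: int, total_cycles: int) -> float:
    """Accesses Per Cycle: memory accesses completed per clock cycle."""
    return total_accesses / total_cycles

def amat(hit_time: float, miss_rate: float, miss_penalty: float) -> float:
    """Average Memory Access Time in cycles (classic textbook form):
    hit time plus the miss rate weighted by the miss penalty."""
    return hit_time + miss_rate * miss_penalty

# Hypothetical counter readings, chosen for illustration.
accesses, cycles = 1_000_000, 4_000_000
print(apc(accesses, cycles))                               # 0.25
print(amat(hit_time=1, miss_rate=0.05, miss_penalty=100))  # 6.0
```

APC is a throughput measure and so rises with overlapped (concurrent) accesses, whereas AMAT is a latency measure; this contrast is why a concurrency-aware notion of locality such as CaL considers both.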
