Abstract

The growing importance of Machine Learning (ML) has led to a proliferation of accelerator designs that target ML workloads. The majority of these designs focus on accelerating compute-intensive regions of ML workloads such as general matrix multiplications (GEMMs) and convolutions. While this is a legitimate approach, we observe in this work that ML workloads also comprise data-intensive computations that exhibit low compute-to-byte ratios and can often contribute considerably to the total execution time. Further, we observe that the presence of such computations opens up an exciting opportunity for near-data processing (NDP) architectures, since they often provision higher memory bandwidth that benefits such computations. Based on these observations, we make a case for a more collaborative approach to ML acceleration, termed Co-ML, in which memory plays an active role and is responsible for NDP-amenable computations, while the compute-intensive computations are executed on the host accelerator as before. We demonstrate how even a relatively simple NDP design can improve the performance of data-intensive computations in ML by up to 20×. Further, for a suite of ML workloads, we demonstrate that Co-ML can deliver speedups as high as 20%, with an average speedup of 14%. Finally, we show that with increasing efforts to build better accelerators for compute-intensive computations, these benefits will likely increase.
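
To make the compute-to-byte ratio argument concrete, the short sketch below estimates the arithmetic intensity of a GEMM versus an element-wise operation. It is an illustrative calculation only; the tensor shapes and the 4-byte (fp32) element size are assumptions for the example and are not taken from the paper.

    # Illustrative sketch: estimate compute-to-byte ratio (arithmetic intensity)
    # for a GEMM vs. an element-wise add. Shapes and fp32 element size are
    # assumed for illustration, not taken from the paper.

    def gemm_intensity(m, n, k, bytes_per_elem=4):
        flops = 2 * m * n * k                        # multiply-accumulate count
        bytes_moved = (m * k + k * n + m * n) * bytes_per_elem
        return flops / bytes_moved

    def elementwise_intensity(n_elems, bytes_per_elem=4):
        flops = n_elems                              # one operation per element
        bytes_moved = 3 * n_elems * bytes_per_elem   # two inputs, one output
        return flops / bytes_moved

    if __name__ == "__main__":
        print(f"GEMM 1024x1024x1024: {gemm_intensity(1024, 1024, 1024):.1f} FLOPs/byte")
        print(f"Element-wise add, 2**20 elems: {elementwise_intensity(2**20):.3f} FLOPs/byte")

Under these assumed shapes, the GEMM performs on the order of a hundred operations per byte moved, whereas the element-wise add performs well under one; operations of the latter kind are the ones that stand to gain from the higher memory bandwidth an NDP design can provide.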
