DDAM: D ata D istribution- A ware M apping of CNNs on Processing-In-Memory Systems

Junpeng Wang,Bo Ding,Song Chen,Qi Xu,Yi Kang,Haitao Du

doi:10.1145/3576196

Abstract

Convolution neural networks (CNNs) are widely used algorithms in image processing, natural language processing and many other fields. The large amount of memory access of CNNs is one of the major concerns in CNN accelerator designs that influences the performance and energy-efficiency. With fast and low-cost memory access, Processing-In-Memory (PIM) system is a feasible solution to alleviate the memory concern of CNNs. However, the distributed manner of data storing in PIM systems is in conflict with the large amount of data reuse of CNN layers. Nodes of PIM systems may need to share their data with each other before processing a CNN layer, leading to extra communication overhead. In this article, we propose DDAM to map CNNs onto PIM systems with the communication overhead reduced. Firstly, A data transfer strategy is proposed to deal with the data sharing requirement among PIM nodes by formulating a Traveling-Salesman-Problem (TSP). To improve data locality, a dynamic programming algorithm is proposed to partition the CNN and allocate a number of nodes to each part. Finally, an integer linear programming (ILP)-based mapping algorithm is proposed to map the partitioned CNN onto the PIM system. Experimental results show that compared to the baselines, DDAM can get a higher throughput of 2.0× with the energy cost reduced by 37% on average.

Full Text