Abstract

Emerging resistive random-access memory (ReRAM) based processing-in-memory (PIM) accelerators have been increasingly explored in recent years because they can efficiently perform the in-situ matrix-vector multiplication (MVM) operations involved in a wide spectrum of artificial neural networks. However, significant challenges remain in applying existing ReRAM-based PIM accelerators to the popular Transformer neural networks. Since Transformers involve a series of matrix-matrix multiplication (MatMul) operations with data dependencies, the intermediate results of MatMuls must be written to ReRAM crossbar arrays for further processing. Consequently, conventional ReRAM-based PIM accelerators often suffer from high ReRAM write latency and intra-layer pipeline stalls. In this paper, we propose ReCAT, a ReRAM-based PIM accelerator designed specifically for Transformers. ReCAT exploits transimpedance amplifiers (TIAs) to cascade a pair of crossbar arrays for the MatMul operations involved in the self-attention mechanism. The intermediate result of a MatMul generated by one crossbar array can be directly mapped to another crossbar array, avoiding costly analog-to-digital conversions. In this way, ReCAT allows MVM operations to overlap with the corresponding data mapping, hiding the high latency of ReRAM writes. Furthermore, we propose an analog-to-digital converter (ADC) virtualization scheme that dynamically shares scarce ADCs among a group of crossbar arrays, significantly improving ADC utilization and eliminating the performance bottleneck of MVM operations. Experimental results show that ReCAT achieves 207.3×, 2.11×, and 3.06× performance improvement on average compared with other Transformer acceleration solutions (GPUs, ReBert, and ReTransformer, respectively).
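The data dependency mentioned above can be illustrated with a minimal plain-Python sketch of standard scaled dot-product self-attention. It is not ReCAT's implementation; the function and variable names are hypothetical, and the comments about crossbar mapping reflect the general assumption that a crossbar holds one MatMul operand as stored cell values. The point is that the projection weights are fixed after training, whereas the transposed keys and the values are computed at run time and still serve as right-hand operands of later MatMuls.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Numerically stable softmax applied to each row independently."""
    result = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention for one head (illustrative only)."""
    # Static operands: w_q, w_k, w_v are known before inference, so they
    # could be programmed into crossbar arrays once (assumption).
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    v = matmul(x, w_v)

    # Dynamic operand: transpose(k) depends on the input x, so it only
    # exists at run time yet acts as the stored operand of this MatMul.
    d_k = len(w_k[0])
    scores = matmul(q, transpose(k))
    scaled = [[s / math.sqrt(d_k) for s in row] for row in scores]
    probs = softmax_rows(scaled)

    # Dynamic operand again: v is also an intermediate result.
    return matmul(probs, v), probs
```

Calling `self_attention(x, w_q, w_k, w_v)` on a small input returns the attention output together with the attention probabilities. The two MatMuls marked as dynamic are the ones whose operands would have to be written to crossbar arrays during inference.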
