STAR: Synthesis of Stateful Logic in RRAM Targeting High Area Utilization

Feng Wang, Hongzhong Zheng, Jinfeng Kang, Dimin Niu, Jiaxi Zhang, Guojie Luo, Yuhao Wang, Guangyu Sun

doi:10.1109/tcad.2020.3015465


Abstract


Processing-in-memory (PIM) exploits massive parallelism with high energy efficiency and has become a promising solution to the von Neumann bottleneck. Recently, emerging metal-oxide resistive random access memory (RRAM) has shown its potential for building PIM architectures, because several stateful logic operations, e.g., IMP and NOR, can be executed in an RRAM crossbar in parallel. Previous synthesis flows focus on improving latency with stateful logic operations, but they ignore that the memory should be used primarily for storage; i.e., most of the crossbar area is used for computation rather than storage. In this situation, storage and computation still have to be separated into different crossbars, which leads to considerable data transfer overhead and limited parallelism. In this work, we define the ratio of storage in a crossbar as area utilization. We aim to improve the area utilization without throughput loss by proposing STAR, a novel synthesis flow for stateful logic. We present two optimization strategies to reduce the computation area in STAR. First, we reduce the area for redundant inputs. Constants shared among different rows (or columns) are encoded as immediate values in the control signals instead of being written into the crossbar at runtime. For the other inputs, we store only one copy in the crossbar. Second, we reduce the area for intermediate variables by reusing invalid cells. We also design a scheduling algorithm to find a computation sequence with the minimum number of variable erasing cycles. Invalid primary inputs can also be erased in this algorithm. Furthermore, we present a case study of image convolution to demonstrate the effectiveness of STAR. Experimental evaluation shows that STAR achieves 33.03% more area utilization and 1.43x throughput compared with SIMPLER, the state-of-the-art stateful logic synthesis flow. Our image convolution implementation also provides 78.36% more area utilization and 1.48x throughput compared with IMAGING, the state-of-the-art stateful logic-based image processing accelerator.
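
To make the notion of area utilization and the effect of reusing invalid cells concrete, the following is a minimal sketch and not the STAR algorithm itself. It assumes a single crossbar row, a NOR-only netlist given in execution order, and a simple greedy rule that frees an intermediate cell after its last use; the erase cycles that the paper's scheduler minimizes are not modeled. All names (`cells_needed`, `area_utilization`, the 16-cell row) are hypothetical and chosen only for illustration.

```python
def evaluate(gates, inputs):
    """Evaluate a NOR-only netlist; gates is a list of (out, (in1, in2))."""
    values = dict(inputs)
    for out, ins in gates:
        values[out] = int(not any(values[v] for v in ins))
    return values


def cells_needed(gates, outputs, reuse=True):
    """Count the cells required for gate outputs in one row.

    Assumption: an intermediate variable becomes invalid after its last
    use, and its cell may then hold the output of a later gate.
    """
    last_use = {}
    for i, (_, ins) in enumerate(gates):
        for v in ins:
            last_use[v] = i
    produced = {out for out, _ in gates}

    free = 0       # invalid cells available for reuse
    allocated = 0  # distinct cells taken from the row
    for i, (_, ins) in enumerate(gates):
        # The output cell must differ from the gate's inputs, so it is
        # chosen before the inputs of this gate are released.
        if reuse and free > 0:
            free -= 1
        else:
            allocated += 1
        for v in set(ins):
            if v in produced and v not in outputs and last_use[v] == i:
                free += 1
    return allocated


def area_utilization(row_cells, compute_cells):
    """Fraction of a row's cells that remain available for storage."""
    return (row_cells - compute_cells) / row_cells


# XOR(a, b) built from five NOR gates (n5 is the result).
xor_gates = [
    ("n1", ("a", "b")),
    ("n2", ("a", "n1")),
    ("n3", ("b", "n1")),
    ("n4", ("n2", "n3")),
    ("n5", ("n4", "n4")),
]

no_reuse = cells_needed(xor_gates, {"n5"}, reuse=False)
with_reuse = cells_needed(xor_gates, {"n5"}, reuse=True)
print(no_reuse, area_utilization(16, no_reuse))
print(with_reuse, area_utilization(16, with_reuse))
```

In this toy case, reusing invalid cells lowers the number of computation cells, so a larger share of the hypothetical 16-cell row stays available for storage. A real flow would also have to account for the cycles spent erasing cells before reuse, which is the cost the scheduling algorithm described above targets.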
