Abstract

Computing-in-memory (CIM) accelerators integrate storage and computation, which gives them the potential to overcome the limits of Moore's law and the von Neumann bottleneck when implementing convolutional neural networks (CNNs). However, the performance of CIM accelerators is still limited by conventional CNN architectures and inefficient readouts. To improve energy efficiency on edge-computing hardware, both an optimized CNN model and a low-power column-parallel readout are needed. In this work, an ReRAM-based CNN accelerator is designed. An effective bitwidth configuration scheme supports mixed-bit operations from 1 bit to 8 bits, so that layer-wise multi-bit CNNs optimized by Neural Architecture Search (NAS) can be implemented. In addition, a variation-reduction accumulation mechanism and low-power readout circuits enable energy-efficient column-parallel readout. We also explore systolic data reuse in an ReRAM-based processing element (PE) array. Experiments are conducted on a NAS-optimized ResNet-18. Benchmarks show that the proposed ReRAM accelerator achieves a peak energy efficiency of 2490.32 TOPS/W for 1-bit operation and an average energy efficiency of 479.37 TOPS/W for 1-to-8-bit operations when evaluating NAS-optimized multi-bitwidth CNNs. Compared with state-of-the-art works, the proposed accelerator improves energy efficiency by at least 14.18×.
