Abstract

Computing-in-memory (CIM) improves the energy efficiency of computing by reducing data movement. In edge devices, CIM accelerators need to support lightweight on-chip training to adapt the model to environmental changes and to keep edge data secure. However, most previous CIM accelerators for edge devices support only inference, with training performed in the cloud, because adding on-chip training incurs considerable area overhead and serious performance degradation. In this work, a CIM accelerator based on emerging nonvolatile memory (NVM) is presented with shared-path transpose read and bit-interleaving weight storage for efficient on-chip training in edge devices. The shared-path transpose read employs a new biasing scheme that eliminates the influence of the body effect on the transpose read, improving both read margin and speed. The bit-interleaving weight storage splits multi-bit weights into individual bits that are stored alternately in the array, substantially accelerating the training computation. For 8-bit inputs and weights, evaluation in a 28 nm process shows that the proposed accelerator achieves 3.34/3.06 TOPS/W energy efficiency for feed-forward/back-propagation, 4.6X lower computing latency, and at least 20% smaller chip area compared with the baseline design.
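As a rough, language-level sketch only (not the paper's circuit or array layout), the bit-interleaving idea can be pictured in software as splitting each 8-bit weight into bit planes so that the same bit position of many weights is stored contiguously and can be read out together; the function name `bit_interleave` and the NumPy layout below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def bit_interleave(weights, n_bits=8):
    """Conceptual sketch: split 8-bit weights into bit planes, LSB first.

    Returns an (n_bits, num_weights) array; row b holds bit b of every
    weight, mimicking a storage layout where bits of different weights
    are interleaved rather than kept contiguous per weight.
    """
    # View signed weights through their two's-complement low n_bits.
    w = np.asarray(weights, dtype=np.int64) & ((1 << n_bits) - 1)
    # One row per bit position across all weights.
    planes = np.stack([(w >> b) & 1 for b in range(n_bits)])
    return planes

weights = [-3, 7, 120, -128]          # example 8-bit weights
planes = bit_interleave(weights)
print(planes.shape)                   # (8, 4): 8 bit planes over 4 weights
interleaved = planes.reshape(-1)      # bit 0 of all weights, then bit 1, ...
```

In this toy view, fetching one row of `planes` corresponds to reading the same bit of many weights in a single access, which is the property the bit-interleaved storage exploits to speed up training-phase computation.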
