Abstract

We propose a novel ultra-low-power, voltage-based compute-in-memory (CIM) design with a new single-ended 8T SRAM bit cell structure. Because the proposed SRAM bit cell uses a single bitline for CIM computation with decoupled read and write operations, it achieves much higher energy efficiency. In addition, the stacked structure of the read unit, which separates the read and write operations, minimizes leakage power consumption. Moreover, the proposed bit cell provides better read and write stability thanks to the isolated read and write paths and a larger pull-up ratio. Unlike state-of-the-art SRAM-CIM designs, our proposed SRAM-CIM requires no extra transistors for CIM vector-matrix multiplication. We implemented a 16 kb (128 × 128) bit cell array computing 128 neurons, with 64 binary inputs (0 or 1) and 64 × 128 binary weights (−1 or +1) for binary neural networks (BNNs). Each row of the bit cell array, corresponding to a single neuron, consists of 128 cells in total: 64 cells for the dot product and 64 replica cells for the ADC, of which 32 serve as the ADC reference and 32 are used for offset calibration. A row-by-row ADC quantizes each neuron's output, supporting 1–7 output bits per neuron. The ADC uses a sweeping method based on the 32 replica cells, with the sweep cycle count set to 2^(N−1) + 1, where N is the number of output bits. Simulations were performed at room temperature (27 °C) in a 45 nm technology using Synopsys HSPICE, with all bit cell transistors at minimum size considering area, power, and speed. The proposed SRAM-CIM reduces power consumption for vector-matrix multiplication by 99.96% compared to the existing state-of-the-art SRAM-CIM. Furthermore, because the read unit is decoupled from the internal node of the latch, there is no feedback from the read unit into the cell, making the design free of read static noise margin concerns.
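To make the computation concrete, the following is a minimal behavioral sketch in Python (not the authors' circuit) of the binary dot product and the sweep-based row ADC described above. The array dimensions and the 2^(N−1) + 1 sweep-cycle count come from the abstract; the uniform reference ladder, the trip-count output code, and all function and variable names are illustrative assumptions.

    import numpy as np

    ROWS = 128        # one neuron per row of the array
    DOT_CELLS = 64    # cells per row used for the dot product
    N_BITS = 4        # example ADC resolution (the design supports 1-7 bits)

    rng = np.random.default_rng(0)
    x = rng.integers(0, 2, size=DOT_CELLS)           # binary inputs: 0 or 1
    W = rng.choice([-1, 1], size=(ROWS, DOT_CELLS))  # binary weights: -1 or +1

    def row_dot_products(x, W):
        # Ideal model of the analog accumulation: each row's bitline level is
        # proportional to sum_i(w_i * x_i), which lies in [-DOT_CELLS, DOT_CELLS].
        return W @ x

    def sweep_adc(v, n_bits, full_scale=DOT_CELLS):
        # Sweep-based quantization: the reference (built from the 32 replica
        # cells in the paper) steps through 2^(N-1) + 1 levels, the cycle count
        # given in the abstract. The uniform reference ladder and the simple
        # "count the comparator trips" output code are assumptions made here
        # for illustration only.
        cycles = 2 ** (n_bits - 1) + 1
        refs = np.linspace(-full_scale, full_scale, cycles)
        return int(np.sum(v > refs))  # output code in [0, cycles]

    codes = [sweep_adc(v, N_BITS) for v in row_dot_products(x, W)]
    print(codes[:8])  # quantized outputs of the first eight neurons

In the actual design the comparison happens in the analog voltage domain against the replica-cell reference; the digital model above only mirrors the input/output behavior of that process.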

Highlights

  • As AI models have become increasingly complex to improve accuracy, the hardware that supports them is becoming heavier and more complex [1]

  • Various next-generation memories such as Resistive RAM (RRAM) [7,8,9,10] and Magnetoresistive RAM (MRAM) [11,12] are emerging, but as shown in Table 1, their speed still lags behind SRAM [13,14]

  • For vector-matrix multiplication calculations, power consumption is reduced by up to 99.96% compared to the state-of-the-art SRAM-CIM


Summary

Introduction

As AI models have become increasingly complex to improve accuracy, the hardware that supports them is becoming heavier and more complex [1]. Such complex, heavy hardware faces various limitations, such as increased power consumption and reduced processing speed under high throughput. In CIM design, the processing speed of the memory used for AI computation is an important factor that cannot be ignored, alongside low power. To meet these requirements, various next-generation memories such as Resistive RAM (RRAM) [7,8,9,10] and Magnetoresistive RAM (MRAM) [11,12] are emerging, but as shown in Table 1, their speed still lags behind SRAM [13,14]. Our proposed structure can be scaled to most array sizes, including 256 × 256 or 512 × 512.

Proposed Compute-in-Memory Design
A Column-Based Neuron Design for BNN
Findings
Conclusions

