Abstract

We present a 256 × 256 in-memory compute (IMC) core designed and fabricated in 14-nm CMOS technology with backend-integrated multi-level phase change memory (PCM). It comprises 256 linearized current-controlled oscillator (CCO)-based A/D converters (ADCs) at a compact 4-μm pitch and a local digital processing unit (LDPU) performing affine scaling and ReLU operations. A frequency-linearization technique for the CCO is introduced, which increases the maximum CCO frequency beyond 3 GHz while ensuring accurate on-chip matrix–vector multiplications (MVMs). Moreover, the design and functionality of the digital ADC calibration procedure are described in detail, and the MVM accuracy is quantified. Finally, the measured classification accuracies of deep learning (DL) inference applications on the MNIST and CIFAR-10 datasets, when two IMC cores are employed, are presented. At a main clock frequency of 1 GHz, a measured energy efficiency of 10.5 TOPS/W is achieved for a performance density of 1.59 TOPS/mm².

Highlights

  • In-memory computing (IMC) is an emerging non-von Neumann paradigm where computation is performed in the memory array itself [1], [2]

  • To accelerate matrix–vector multiplication (MVM) operations using IMC, the memory system must be repurposed into a single-instruction, multiple-data (SIMD) array of processing elements [4], where the input vector data is broadcast across the matrix rows and the partial products are summed along each column

  • We propose a hardware-centric approach to characterize the quality of the analog MVM operation
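The broadcast-and-accumulate dataflow described in the highlights above can be sketched in a few lines of Python. This is an idealized, purely digital model and not the authors' implementation: the names `imc_mvm`, `weights`, and `inputs` are illustrative assumptions, and no device or converter non-idealities are modeled.

```python
def imc_mvm(weights, inputs):
    """Idealized crossbar MVM (illustrative sketch, hypothetical names).

    weights[r][c] is the value stored at row r, column c.
    Each input element is broadcast along one row, and every column
    accumulates its own sum of partial products.
    """
    n_rows = len(weights)
    n_cols = len(weights[0])
    if len(inputs) != n_rows:
        raise ValueError("one input element per row is required")
    outputs = [0.0] * n_cols
    for r in range(n_rows):
        x = inputs[r]  # broadcast along row r
        for c in range(n_cols):
            outputs[c] += weights[r][c] * x  # column-wise accumulation
    return outputs


# Small 3 x 2 example with made-up values.
weights = [[1.0, 2.0],
           [3.0, 4.0],
           [5.0, 6.0]]
inputs = [1.0, 0.5, 2.0]
result = imc_mvm(weights, inputs)
```

In software the two loops run sequentially; in an IMC array all rows and columns contribute at once, which is where the parallelism comes from.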


Summary

INTRODUCTION

In-memory computing (IMC) is an emerging non-von Neumann paradigm where computation is performed in the memory array itself [1], [2]. Voltage-based A/D converters (ADCs) are mostly used [31]; these require a current-to-voltage conversion, usually employing a large capacitor for integration [23], [32]. This has so far hampered the realization of large, fully parallel on-chip MVM operations at true O(1) complexity. The proposed core comprises compact, low-latency, and energy-efficient current-controlled oscillator (CCO)-based ADCs, digital readout blocks, and a local digital processing unit (LDPU) performing affine scaling and ReLU operations. The paper also presents a comparison of the proposed PCM-based core with other state-of-the-art IMC designs.
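To make the readout chain concrete, the following Python sketch models a CCO-based conversion followed by the affine scaling and ReLU step attributed to the LDPU. It rests on strong assumptions that are not taken from the paper: a perfectly linear current-to-frequency characteristic, a fixed counting window, and a single scalar scale and offset. The function names `cco_adc_count` and `ldpu`, and all numeric values, are hypothetical.

```python
def cco_adc_count(current_ua, gain_mhz_per_ua, window_ns):
    """Idealized CCO-based ADC: count oscillator cycles in a fixed window.

    Assumes frequency = gain * current (perfectly linear), which is a
    simplification. Integer arithmetic is used: MHz * ns = 1e-3 cycles.
    """
    freq_mhz = gain_mhz_per_ua * current_ua
    return (freq_mhz * window_ns) // 1000


def ldpu(counts, scale, offset):
    """Affine scaling followed by ReLU, applied to each ADC count."""
    return [max(0.0, scale * c + offset) for c in counts]


# Hypothetical column currents in microamperes.
currents_ua = [10, 4, 0]
counts = [cco_adc_count(i, gain_mhz_per_ua=100, window_ns=100)
          for i in currents_ua]
activations = ldpu(counts, scale=0.5, offset=-30.0)
```

Because the oscillator consumes the column current directly, this scheme needs no integration capacitor; the count is simply proportional to the current under the linearity assumption stated above.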

UNIT CELL AND ARRAY DESIGN
LINEARIZED CCO-BASED ADC
CCO-Based ADC Structure
Linearization Technique
Counter With Variable Increment Size
ADC Calibration Procedure
MVM OPERATION
Multi-Bit and Bit-Serial Input Modulation
APPLICATIONS AND RESULTS
Hardware-Aware Training
Hardware Experiment
System-Level Performance
CONCLUSION
