Abstract

Resistive crossbar arrays can carry out energy-efficient vector-matrix multiplication, which is a crucial operation in most machine learning applications. However, practical computing tasks that require high precision remain challenging to implement in such arrays because of intrinsic device variability. Herein, we experimentally demonstrate a precision-extension technique whereby high precision can be attained through the combined operation of multiple devices, each of which stores a portion of the required bit width. Additionally, the analog-to-digital converters are designed to remove the unpredictable effects of noise sources. An 8 × 15 carbon nanotube transistor array performs multiplication operations with operands of up to 16 valid bits without any error, making in-memory computing approaches attractive for high-throughput, energy-efficient machine learning accelerators.
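
As a rough illustration of the precision-extension idea (a sketch only, not the paper's circuit-level implementation), the matrix entries can be split into low-bit slices, each slice assigned to its own set of devices, and the partial products re-combined digitally with the appropriate bit shifts. The 4-bit slice width, the NumPy modeling, and the function name below are assumptions for illustration.

```python
import numpy as np

def vmm_bit_sliced(v_in, W, total_bits=16, slice_bits=4):
    """Sketch of precision extension: a high-precision integer matrix W is
    split into low-bit slices, each slice (notionally stored on its own set
    of devices) contributes a partial vector-matrix product, and the partial
    results are re-combined with digital shift-and-add.  Illustrative only."""
    n_slices = total_bits // slice_bits
    result = np.zeros(W.shape[1], dtype=np.int64)
    for s in range(n_slices):
        # Extract the s-th group of slice_bits bits from every matrix entry.
        W_slice = (W >> (s * slice_bits)) & ((1 << slice_bits) - 1)
        # One low-precision VMM per slice (here exact integer math).
        partial = v_in @ W_slice
        # Digital shift restores the slice's bit significance.
        result += partial.astype(np.int64) << (s * slice_bits)
    return result

rng = np.random.default_rng(0)
W = rng.integers(0, 2**16, size=(8, 15))            # 16-bit matrix entries
v = rng.integers(0, 2**8, size=8)                   # example 8-bit inputs
assert np.array_equal(vmm_bit_sliced(v, W), v @ W)  # exact, error-free result
```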

Highlights

  • Advancements in deep neural networks for machine learning have increased the demand for high-throughput, energy-efficient hardware accelerators

  • We experimentally demonstrated the vector–matrix multiplication (VMM) operation in a carbon nanotube (CNT) transistor crossbar array [15], where high precision was attained by leveraging the quantization process of an analog-to-digital converter (ADC)

  • Ideally (Figure 1b), VMM computations can be performed in a resistive crossbar array by applying an input voltage signal (Vin) to the rows, and the collected current signal (Iout) along each column reflects the summation of the results obtained by multiplying the input voltages with the device conductances (G) according to Kirchhoff’s current law; a minimal numerical sketch of this ideal operation follows the list

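The following sketch models the ideal crossbar operation described in the last highlight, using illustrative (not measured) conductance and voltage values:

```python
import numpy as np

# Ideal crossbar model: device conductances G[i, j] connect row i to column j.
# With a voltage V_in[i] applied to each row, Kirchhoff's current law gives
# the column currents I_out[j] = sum_i V_in[i] * G[i, j], i.e. a VMM.
G = np.array([[1e-6, 2e-6, 0.5e-6],
              [3e-6, 1e-6, 2e-6]])   # conductances in siemens (illustrative)
V_in = np.array([0.2, 0.1])          # row voltages in volts (illustrative)
I_out = V_in @ G                     # column currents in amperes
print(I_out)                         # [5e-07, 5e-07, 3e-07]
```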

Introduction

Advancements in deep neural networks for machine learning have increased the demand for high-throughput, energy-efficient hardware accelerators. Novel approaches using multiple devices or mixed-precision architectures have been developed to mitigate device non-idealities, enabling the use of resistive crossbar arrays (generally called dot-product engines (DPEs) [8], shown in Figure 1a) in practical computing tasks with acceptable accuracy, such as differential- and matrix-equation solvers [7], [9]. Training with numbers of fewer than 16 bits remains challenging because of the difficulty in maintaining the fidelity of gradient computations during back-propagation [11]. Consequently, most previous DPEs cannot provide sufficient precision for deep learning because of precision-related issues, e.g., series resistance of wires, sneak-path currents, and other noise sources. The intrinsic variability of resistive devices, i.e., cycle-to-cycle and device-to-device variations in their conductance modulation, inevitably deteriorates the VMM accuracy [12]–[14].
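
To make the effect of conductance variability concrete, the following sketch perturbs each programmed conductance with 5% relative Gaussian noise (an illustrative figure, not a measurement from this work) and compares the resulting column currents with the ideal VMM result.

```python
import numpy as np

rng = np.random.default_rng(1)
G_target = rng.uniform(1e-6, 10e-6, size=(8, 15))   # programmed conductances
V_in = rng.uniform(0.0, 0.2, size=8)                 # row voltages

# Device-to-device / cycle-to-cycle variation modeled as 5% relative noise
# (illustrative figure only).
G_actual = G_target * (1 + 0.05 * rng.standard_normal(G_target.shape))

I_ideal = V_in @ G_target
I_noisy = V_in @ G_actual
rel_err = np.abs(I_noisy - I_ideal) / np.abs(I_ideal)
print(f"max relative VMM error: {rel_err.max():.2%}")
# An error of a few percent already corrupts the low-order bits of a
# high-precision result, motivating the precision-extension scheme above.
```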
