Abstract

Deep Neural Networks (DNNs) have been widely used in various artificial intelligence (AI) applications due to their overwhelming performance. Furthermore, several recently reported algorithms require on-device training to deliver higher performance in real-world environments and to protect users' personal data. However, edge/mobile devices have only limited computation capability and battery power, so an energy-efficient DNN training processor is necessary to realize on-device training. Although there are many surveys on energy-efficient DNN inference hardware, training is quite different from inference, so analysis and optimization techniques targeting DNN training are required. This article provides an overview of energy-efficient DNN processing that enables on-device training. Specifically, it presents hardware optimization techniques that overcome the design challenges in terms of distinct dataflow, external memory access, and computation. In addition, this article summarizes the key schemes of recent energy-efficient DNN training ASICs and presents a design example of a DNN training ASIC with energy-efficient optimization techniques.

Highlights

  • Deep neural networks (DNNs) [1] have been widely studied across all technology domains due to their superior accuracy in various applications such as computer vision [2]–[11], natural language processing (NLP) [12], [13], and autonomous systems [14]–[16]

  • DNN training requires a significant number of operations, so users' edge/mobile devices have traditionally provided only inference, using downloaded DNN parameters pre-trained on cloud servers

  • References [17]–[19] proposed DNN training schemes that use private datasets stored on user devices

Summary

INTRODUCTION

Deep neural networks (DNNs) [1] have been widely studied across all technology domains due to their superior accuracy in various applications such as computer vision [2]–[11], natural language processing (NLP) [12], [13], and autonomous systems [14]–[16]. DNN training iteratively processes three distinct steps, commonly forward propagation, backward propagation, and weight gradient update, to find high-accuracy model parameters, incurring a large number of operations, extensive external memory access, and varied dataflows. This makes the realization of on-device DNN training very challenging, since edge devices have only limited computation capability and battery power. Although there are many studies on highly energy-efficient DNN inference hardware that optimize memory access and computation, DNN training is quite different from DNN inference. We analyze three design challenges for energy-efficient DNN training: dataflow, external memory access, and computation. We introduce the optimization techniques of recent DNN training hardware and describe an example of an energy-efficient DNN training ASIC design.
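To make the three training steps concrete, the following minimal NumPy sketch walks through one training iteration of a single fully connected layer. The layer shape, mean-squared-error loss, and plain SGD update are illustrative assumptions, not details from the article; the point is that backward propagation and the weight gradient step reuse the same tensors with different dataflows, which is one source of the hardware design challenges discussed below.

```python
import numpy as np

# Hypothetical single-layer example illustrating the three training steps:
#   1. forward propagation (FP)    - compute activations and the loss
#   2. backward propagation (BP)   - propagate the error to the earlier layer
#   3. weight gradient update (WG) - compute weight gradients and update

rng = np.random.default_rng(0)
x = rng.standard_normal((32, 64))          # input activation batch
y = rng.standard_normal((32, 10))          # illustrative regression targets
W = rng.standard_normal((64, 10)) * 0.01   # layer weights
lr = 1e-2                                  # learning rate (assumed)

# 1. FP: activations and mean-squared-error loss
out = x @ W
loss = np.mean((out - y) ** 2)

# 2. BP: error w.r.t. the layer output, then the input gradient
#    that would feed the preceding layer (transposed-weight dataflow)
d_out = 2.0 * (out - y) / out.size
d_x = d_out @ W.T

# 3. WG: weight gradient (activation-transposed dataflow) and SGD update
d_W = x.T @ d_out
W -= lr * d_W

print(f"loss = {loss:.4f}")
```

Note that inference executes only the first step, whereas training must also keep `x` alive for the WG step and stream `W` in transposed order for BP, hinting at why training-specific dataflow and memory optimizations are needed.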

HW DESIGN CHALLENGES FOR DNN TRAINING
DNN TRAINING ASIC DESIGN EXAMPLE
SUMMARY OF DNN TRAINING ASICS
CONCLUSION