Design of Power-Efficient Training Accelerator for Convolution Neural Networks

Jiun Hong, Taegeon Lee, Saad Arslan, Hyungwon Kim

doi:10.3390/electronics10070787


Open Access

https://doi.org/10.3390/electronics10070787


Abstract

Among deep learning techniques, a type of deep neural network (DNN) called a convolutional neural network (CNN) is one of the most widely used models for image recognition applications. However, there is growing demand for lightweight and low-power neural network accelerators, not only for inference but also for the training process. In this paper, we propose a training accelerator that provides low power and compact chip size, targeted at mobile and edge computing applications. It achieves real-time processing of both inference and training using concurrent floating-point data paths. The proposed accelerator can be externally controlled and employs resource sharing and an integrated convolution-pooling block to achieve low area and low energy consumption. We implemented the proposed training accelerator in an FPGA (Field Programmable Gate Array) and evaluated its training performance using an MNIST CNN example in comparison with a PC with a GPU (Graphics Processing Unit). While both methods achieved a similar training accuracy of 95.1%, the proposed accelerator, when implemented in a silicon chip, reduced the energy consumption by 480 times compared to the PC with GPU. Additionally, when implemented on an FPGA, it achieved an energy reduction of over 4.5 times compared to the existing FPGA training accelerator for the MNIST dataset. Therefore, the proposed accelerator is more suitable for deployment in mobile/edge nodes than the existing software and hardware accelerators.

Highlights

Deep learning is a type of machine learning based on artificial neural networks
An artificial neural network (ANN) is a neural network whose structure is modeled based on the human brain
A convolution neural network (CNN) extends the structure of an ANN by employing convolutional filters and feature map compression layers called pooling to reduce the need for a large number of weights
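The weight-saving claim in the last highlight can be illustrated with a small count. The following is a minimal sketch, not the paper's network: the 28x28 input matches MNIST, but the 3x3 kernel, 4 filters, 2x2 pooling window, and 10 output classes are assumed values chosen for illustration, and biases are ignored.

```python
def dense_weights(num_inputs, num_outputs):
    """Weights in a fully connected layer: one per (input, output) pair."""
    return num_inputs * num_outputs


def conv_weights(kernel, in_channels, out_channels):
    """Weights in a convolution layer: one kernel x kernel filter per
    (input channel, output channel) pair, shared across all positions."""
    return kernel * kernel * in_channels * out_channels


def pooled_side(side, kernel, pool):
    """Feature-map side length after a stride-1 convolution without padding,
    followed by non-overlapping pool x pool pooling."""
    return (side - kernel + 1) // pool


# Assumed, illustrative sizes (not taken from the paper).
IMG, KERNEL, FILTERS, POOL, CLASSES = 28, 3, 4, 2, 10

conv_side = IMG - KERNEL + 1                 # 26x26 map per filter
conv_values = conv_side * conv_side * FILTERS

# Producing the same number of outputs with a dense layer vs. shared filters.
print("dense layer weights:", dense_weights(IMG * IMG, conv_values))
print("conv layer weights: ", conv_weights(KERNEL, 1, FILTERS))

# Pooling shrinks the feature map, so the classifier that follows needs fewer weights.
side = pooled_side(IMG, KERNEL, POOL)
pooled_values = side * side * FILTERS
print("classifier weights without pooling:", dense_weights(conv_values, CLASSES))
print("classifier weights with pooling:   ", dense_weights(pooled_values, CLASSES))
```

Under these assumptions the shared filters need tens of weights where a dense layer with the same output count needs millions, and 2x2 pooling cuts the following classifier layer to a quarter of its size.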

Summary

Introduction

Deep learning is a type of machine learning based on artificial neural networks. An artificial neural network (ANN) is a neural network whose structure is modeled on the human brain. Despite the growing need for fast, low-power solutions for training deep neural networks (DNNs) on mobile devices [22], only real-time inference solutions using dedicated hardware units and processors are common. This is because the computing and memory resources on mobile devices are too limited to train deep learning models.

One of the listed contributions (item 2) is a design and verification methodology for the accelerator architecture and chip implementation: a high-level CNN model is converted to a hardware structure, and the computation results of the hardware are compared against those of the model.

Background

Training in Convolutional Neural Networks

Fully Connected Layer

Figure: Backward pooling result

Softmax and Cross Entropy Error

Combined Convolution and Pooling Operation

Operation of Training and Inference Mode

Resource Sharing

Implementation and Evaluation

Findings

Conclusions


Journal: Electronics
Publication Date: Mar 26, 2021
Citations: 10
License type: CC BY 4.0
