Abstract

In implementing a convolutional neural network (CNN)-based object detection system, the primary issues are power dissipation and limited throughput. Even with ultra-low-power devices, dynamic power dissipation remains difficult to resolve. During the operation of a CNN algorithm, several factors contribute to this: the heat generated by the massive computational complexity, the bottleneck caused by data transfer over limited bandwidth, and the power dissipated by redundant data accesses. This article proposes low-power techniques, applies them to a CNN accelerator in both the FPGA and ASIC design flows, and evaluates them on the Xilinx ZCU-102 FPGA SoC hardware platform and in a 45 nm technology for ASIC, respectively. Our proposed low-power techniques are applied at the register-transfer level (RT-level), targeting both FPGA and ASIC. We achieve up to a 53.21% power reduction in the ASIC implementation and save 32.72% of the dynamic power dissipation in the FPGA implementation. This shows that our RTL low-power schemes offer strong potential for dynamic power reduction when applied to the FPGA and ASIC design flows for implementing a CNN-based object detection system.
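To give a concrete flavor of the RT-level schemes the paper covers (the Background section lists clock-gating variants such as local explicit clock gating), below is a minimal Verilog sketch of latch-based local explicit clock gating. The module name, signal names, and bit width are illustrative assumptions, not taken from the paper's accelerator.

```verilog
// Minimal sketch of latch-based local explicit clock gating.
// All names (gated_regbank, wr_en, din, dout) are hypothetical.
module gated_regbank (
    input  wire        clk,
    input  wire        rst_n,
    input  wire        wr_en,   // write enable; register is idle when low
    input  wire [15:0] din,
    output reg  [15:0] dout
);
    reg  en_latch;
    wire gclk;

    // Capture the enable while clk is low so gclk cannot glitch
    // if wr_en changes while clk is high.
    always @(clk or wr_en)
        if (!clk)
            en_latch <= wr_en;

    // The register receives clock edges only when a write is pending,
    // so no dynamic power is spent toggling the flops on idle cycles.
    assign gclk = clk & en_latch;

    always @(posedge gclk or negedge rst_n)
        if (!rst_n)
            dout <= 16'd0;
        else
            dout <= din;

    // The "local explicit clock enable" alternative keeps clk running
    // and gates only the data path instead:
    //   always @(posedge clk) if (wr_en) dout <= din;
endmodule
```

In an ASIC flow this gating function is typically instantiated as a dedicated integrated clock gating (ICG) cell from the standard-cell library, whereas FPGA tools usually map enables onto the flip-flops' dedicated clock-enable pins.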

Highlights

  • Among the machine learning algorithms, the convolutional neural network (CNN) model is currently one of the most popular architectures

  • With the growing usage of CNN-based Internet of Things (IoT) products, including autonomous vehicles, companies are developing and releasing customized chips of various sizes to support the massive amount of CNN computation, such as the Tensor Processing Unit (TPU), deep learning processing unit (DPU), holographic processing unit (HPU), image processing unit (IPU), neural network processing unit (NPU), and vision processing unit (VPU) [13]

  • We demonstrate our proposed techniques on the CNN accelerator, which consumes the most power in the CNN architecture


Summary

Introduction

Among the machine learning algorithms, the convolutional neural network (CNN) model is currently one of the most popular architectures. To support CNN workloads, hardware has been developed that can process large amounts of data simultaneously or in parallel; this increased capability inevitably causes a large amount of power consumption. With the growing usage of CNN-based Internet of Things (IoT) products, including autonomous vehicles, companies are developing and releasing customized chips of various sizes to support the massive amount of CNN computation, such as the Tensor Processing Unit (TPU), deep learning processing unit (DPU), holographic processing unit (HPU), image processing unit (IPU), neural network processing unit (NPU), and vision processing unit (VPU) [13]. On the algorithm side, to achieve high-performance and high-throughput results, most researchers and developers have proposed novel CNN architectures and efficient memory structures that decrease processing time and improve parallel computing performance, thereby increasing power efficiency [15,16,17,18]. We evaluate power consumption in a reliable experimental environment that includes an FPGA platform and an ASIC design flow.

Background
Clock Gating
Local Explicit Clock Enable
Local Explicit Clock Gating
Bus-Specific Clock Gating
Enhanced Clock Gating
Memory Split
Proposed CNN Accelerator
Practical Application of the Industrial CNN Accelerator
Experiment Results
Testing Environment
FPGA Implementation Result
ASIC Implementation Result
Conclusions
