Abstract

Scene text interpretation is a critical component of natural scene understanding. Most existing work relies on high-end graphics processing unit (GPU) implementations, which are typically deployed on the server side. In Internet of Things (IoT) scenarios, however, the communication overhead from the edge device to the server is substantial and can even dominate the total processing time, so an edge-computing-oriented design is needed. In this paper, we present the architectural design and implementation of a natural scene text interpretation (NSTI) accelerator that efficiently classifies and localizes text regions at the pixel level, in real time, on mobile devices. To achieve real-time, low-latency processing, a binary convolutional encoder–decoder network is adopted as the core architecture; its binary weights and activations enable massive parallelism. Massively parallelized computation and highly pipelined data-flow control improve both latency and throughput. In addition, all binarized intermediate results and parameters are stored on chip, eliminating the power and latency overhead of off-chip communication. The NSTI accelerator is implemented in a 40 nm CMOS technology and processes scene text images (128 × 32 pixels) at 34 fps with a latency of 40 ms for pixelwise interpretation, achieving pixelwise classification accuracy above 90% on the ICDAR-03 and ICDAR-13 datasets. Its actual energy efficiency is 698 GOP/s/W, and its peak energy efficiency reaches 7825 GOP/s/W. The proposed accelerator is 7 times more energy efficient than an optimized GPU-based implementation while maintaining real-time throughput at 40 ms latency.
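To make the core idea concrete, below is a minimal PyTorch sketch of a binary convolutional encoder–decoder for pixelwise text/non-text classification. It illustrates only the network family the abstract names: the layer count, channel widths, and the BinarizeSTE and BinaryConv2d helpers are illustrative assumptions, not the paper's actual topology or its accelerator mapping.

import torch
import torch.nn as nn
import torch.nn.functional as F


class BinarizeSTE(torch.autograd.Function):
    # Sign binarization with a straight-through gradient estimator
    # (a common training trick for binary networks; assumed here).
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)  # values in {-1, 0, +1}; 0 is rare in practice

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Pass gradients only inside the hard-tanh window |x| <= 1.
        return grad_output * (x.abs() <= 1).float()


class BinaryConv2d(nn.Conv2d):
    # Convolution whose weights are binarized to {-1, +1} on the fly.
    def forward(self, x):
        w_bin = BinarizeSTE.apply(self.weight)
        return F.conv2d(x, w_bin, self.bias, self.stride, self.padding)


class BinaryEncoderDecoder(nn.Module):
    # Encoder downsamples the image; decoder upsamples back to the input
    # resolution and emits a per-pixel text/non-text score map.
    def __init__(self, in_ch=3, mid_ch=64):
        super().__init__()
        self.enc1 = BinaryConv2d(in_ch, mid_ch, 3, stride=2, padding=1)
        self.enc2 = BinaryConv2d(mid_ch, mid_ch, 3, stride=2, padding=1)
        self.dec1 = BinaryConv2d(mid_ch, mid_ch, 3, padding=1)
        self.dec2 = BinaryConv2d(mid_ch, 1, 3, padding=1)

    def forward(self, x):
        h = BinarizeSTE.apply(self.enc1(x))   # 1/2 resolution, binary activations
        h = BinarizeSTE.apply(self.enc2(h))   # 1/4 resolution
        h = F.interpolate(h, scale_factor=2)  # back to 1/2 resolution
        h = BinarizeSTE.apply(self.dec1(h))
        h = F.interpolate(h, scale_factor=2)  # full resolution
        return self.dec2(h)                   # per-pixel logits


if __name__ == "__main__":
    img = torch.randn(1, 3, 32, 128)          # one 128 x 32 crop, as in the paper
    logits = BinaryEncoderDecoder()(img)
    print(logits.shape)                       # torch.Size([1, 1, 32, 128])

Binarizing both the weights and the intermediate activations, as the abstract's on-chip storage of "binarized intermediate results and parameters" implies, is what typically lets binary-network hardware replace multiplications with XNOR and popcount operations and keep the entire model in on-chip memory.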
