Evaluation and Acceleration of High-Throughput Fixed-Point Object Detection on FPGAs

Xiaoyin Ma, Walid A Najjar, Amit K Roy-Chowdhury

doi:10.1109/tcsvt.2014.2360030

Abstract

Reliance on object or people detection is rapidly growing beyond surveillance to industrial and social applications. The histogram of oriented gradients (HOG), one of the most popular object detection algorithms, achieves high detection accuracy but delivers just under 1 frame/s on a high-end CPU. Field-programmable gate array (FPGA) accelerations of this algorithm are limited by its intensive floating-point computations. All current fixed-point HOG implementations either use a large bit width to maintain detection accuracy or perform poorly at reduced data precision. In this paper, we introduce a full-image evaluation methodology to explore the FPGA implementation of HOG using reduced bit width. This approach lessens the required area resources on the FPGA and increases the clock frequency, and hence the throughput per device through increased parallelism. We evaluate the detection accuracy of the fixed-point HOG by applying state-of-the-art computer vision pedestrian detection evaluation metrics and show that it performs as well as the original floating-point code from OpenCV. We then show that our single-FPGA implementation achieves $68.7\times$ higher throughput than a high-end CPU, $5.1\times$ higher than a high-end graphics processing unit (GPU), and $7.8\times$ higher than the same implementation using floating-point on the same FPGA. A power consumption comparison across platforms shows that our fixed-point FPGA implementation uses $130\times$ less power than the CPU and $31\times$ less energy than the GPU to process one image.
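The trade-off the abstract describes, fewer bits against lost precision, can be illustrated with a small sketch. The abstract does not state the bit widths or number format the authors use, so the function names (`to_fixed`, `from_fixed`, `max_quantization_error`), the signed format with saturation, and the sample values below are all assumptions chosen for illustration, not the paper's design.

```python
def to_fixed(x, frac_bits, total_bits):
    """Quantize a real value to a signed fixed-point integer code.

    Hypothetical format: `total_bits` wide, `frac_bits` fractional bits,
    round-to-nearest, saturating at the representable range.
    """
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    code = int(round(x * scale))
    return max(lo, min(hi, code))


def from_fixed(code, frac_bits):
    """Convert a fixed-point integer code back to a real value."""
    return code / (1 << frac_bits)


def max_quantization_error(values, frac_bits, total_bits):
    """Largest absolute round-trip error over a set of sample values."""
    return max(
        abs(v - from_fixed(to_fixed(v, frac_bits, total_bits), frac_bits))
        for v in values
    )


if __name__ == "__main__":
    # Assumed test data: values in [0, 1], standing in for normalized
    # histogram bins; not taken from the paper.
    samples = [i / 1000.0 for i in range(1001)]
    for frac_bits in (4, 8, 12):
        err = max_quantization_error(samples, frac_bits, frac_bits + 2)
        print(f"{frac_bits} fractional bits: max error {err:.6f}")
```

For in-range values the round-trip error is bounded by half of one least-significant bit, so each extra fractional bit halves the worst-case error while adding hardware cost. Whether a given width preserves detection accuracy is an empirical question, which is what the full-image evaluation in the paper addresses.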


Journal: IEEE Transactions on Circuits and Systems for Video Technology (a publication of the Circuits and Systems Society)
Publication Date: Jun 1, 2015
Citations: 83
License type: other-oa

Similar Papers

FPGA, GPU, and CPU implementations of Jacobi algorithm for eigenanalysis
Mustafa U Torun ... Ali N Akansu
Journal of Parallel and Distributed Computing | VOL. 96
31 May 2016

Highly Parameterized K-means Clustering on FPGAs: Comparative Results with GPPs and GPUs
Hanaa M Hussain ... Ahmet T Erdogan
01 Nov 2011

FPGA Implementation of an Evolving Spiking Neural Network
... Snjezana Soltic
01 Jan 2009

A Hardware-Efficient HOG-SVM Algorithm and its FPGA Implementation
Pengcheng Dai ... Yue Yu
01 Aug 2021
