Abstract

As a typical machine-learning-based detection technique, deformable part models (DPM) have achieved great success in detecting complex object categories. The heavy computational burden of DPM, however, severely restricts their use in many real-world applications. In this work, we accelerate DPM via parallelization and hypothesis pruning. First, we implement the original DPM approach on a GPU platform and parallelize it, making it 136 times faster than DPM release 5 without loss of detection accuracy. Furthermore, we use a mixture root template as a prefilter for hypothesis pruning and achieve more than 200 times speedup over DPM release 5, which is, as far as we know, the fastest implementation of DPM yet. The performance of our method has been validated on the Pascal VOC 2007 and INRIA pedestrian datasets and compared to other state-of-the-art techniques.

Highlights

  • Detecting objects in visual media is important for many computer vision tasks

  • We evaluated a range of deformable part model (DPM) acceleration methods on the Pascal VOC 2007 dataset: (1) DPM release 5 [24], (2) cascade DPM [13], (3) branch-bound DPM [19], (4) FFT DPM [17], (5) coarse-to-fine DPM [16], (6) fastest DPM [20], and (7) our approach

  • We report the average precision (AP) of all 20 object categories in Table 2, where DPM-GPU-P is the proposed method in this work and DPM-C++-omp is the multicore C++ version of DPM release 5

Summary

Introduction

Detecting objects in visual media is important for many computer vision tasks. To understand an image, object detection is usually the first step. After the templates (filters) have been trained, the original DPM at detection time evaluates the appearance score densely at every image position and scale by calculating the correlation between filters and feature maps. These two factors result in a heavy computational burden for DPM, but the burden can potentially be largely relieved. Our GPU version of DPM, which we call DPM-GPU, significantly outperforms other accelerated DPM approaches, achieving over 136 times speedup compared to the original DPM release 5 without accuracy loss. This is apparently the most complete attempt to parallelize DPM release 5 on the GPU and evaluate it on challenging public datasets. As far as we know, this is the fastest implementation of DPM yet.
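
To make the dense evaluation described above concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a feature map stored as nested lists indexed `[row][column][channel]` and a filter of the same layout; the names `dense_filter_scores` and `score_pyramid` are hypothetical and chosen only for illustration. The score at each position is the sum of channel-wise products between the filter and the feature-map window it covers, repeated for every valid position and every pyramid level (scale).

```python
from typing import List

# Assumed layout (illustrative only): grid[row][col] is a list of feature channels.
Grid = List[List[List[float]]]


def dense_filter_scores(feature_map: Grid, filt: Grid) -> List[List[float]]:
    """Correlate `filt` with `feature_map` at every valid position."""
    fh, fw = len(filt), len(filt[0])
    height, width = len(feature_map), len(feature_map[0])
    scores = []
    for y in range(height - fh + 1):
        row = []
        for x in range(width - fw + 1):
            total = 0.0
            # Sum of products over the filter window and all channels.
            for dy in range(fh):
                for dx in range(fw):
                    cell = feature_map[y + dy][x + dx]
                    weights = filt[dy][dx]
                    total += sum(a * b for a, b in zip(cell, weights))
            row.append(total)
        scores.append(row)
    return scores


def score_pyramid(pyramid: List[Grid], filt: Grid) -> List[List[List[float]]]:
    """Apply the same filter to every level (scale) of a feature pyramid."""
    return [dense_filter_scores(level, filt) for level in pyramid]


if __name__ == "__main__":
    # Tiny single-channel example: a 3x3 map and a 2x2 filter of ones.
    fmap = [[[1.0], [2.0], [3.0]],
            [[4.0], [5.0], [6.0]],
            [[7.0], [8.0], [9.0]]]
    ones = [[[1.0], [1.0]],
            [[1.0], [1.0]]]
    print(dense_filter_scores(fmap, ones))
```

Each output position in this sketch is computed independently of the others, which is one reason this kind of dense scoring is a natural candidate for parallel execution.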

Related work
DPM revisited
Mixture root filters guided hypothesis pruning
GPU implementation
GPU optimization
Empirical results
Experimental setting
Grid search for the optimal threshold
Evaluation on Pascal VOC 2007
Evaluation on INRIA pedestrian dataset
Evaluation on plate localization dataset
Conclusions and discussion