Abstract

Compared with other applications in computer vision, convolutional neural networks (CNNs) have underperformed on pedestrian detection. A breakthrough was made very recently using sophisticated deep CNN (DCNN) models, with a number of handcrafted features or explicit occlusion handling mechanism. In this paper, we show that by reusing the convolutional feature maps of a DCNN model as image features to train an ensemble of boosted decision models, we are able to achieve the best reported accuracy without using specially designed learning algorithms. We empirically identify and disclose important implementation details. We also show that pixel labeling may be simply combined with a detector to boost the detection performance. By adding complementary handcrafted features such as optical flow, the DCNN-based detector can be further improved. We advance the state-of-the-art results by lowering the log-average miss rate from 11.7% to 8.9% on the Caltech data set and from 11.2% to 8.6% on the Inria data set. We also achieve a comparable result to state-of-the-art approaches on the KITTI data set.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call