Abstract

Because pedestrians usually appear upright in image and video data, we employ a statistical model of the upright human body in which the head, the upper body, and the lower body are treated as three distinct components. By incorporating different kinds of low-level measurements, the resulting multi-modal, multi-channel Haar-like features capture characteristic differences between parts of the human body yet remain robust to variations in clothing and environmental settings. We then use a Switchable Deep Network (SDN) for pedestrian detection; the SDN automatically learns features of different body parts. Experimental results on many pedestrian datasets show that the proposed algorithm significantly improves detection rates at 0.1 false positives per image (FPPI) compared with state-of-the-art domain adaptation methods, and that it is robust and accurate under cluttered dynamic backgrounds, occlusion, and object deformation.

Highlights

  • Pedestrian detection is a challenging and important task of great interest in computer vision [1]

  • We propose a method that occupies a middle ground: we design compact, discriminative Haar-like features selected from a template pool that reflects prior information about the upright pedestrian body shape

Summary

Introduction

Pedestrian detection is a challenging and important task of great interest in computer vision [1], and significant progress has been achieved in recent years [2]. The problem is challenging because pedestrian images undergo large variations in visual appearance due to changes in pose, viewpoint, clothing, lighting, and resolution. Many pedestrian detectors have been developed to address these challenges. They extract manually designed features, such as HOG [3] and Haar-like descriptors [4] or their combinations [5], from images, and employ classifiers such as boosting [6], SVM [3], and structural SVM [7] to decide whether a detection window should be classified as a pedestrian. To handle larger and more complex variations, a mixture of templates is learned for each body part [8]. Such templates (e.g., poselets [8]) are learned by clustering pose annotations and region appearance. Aspects arising from the peculiar geometry of high-dimensional spaces, e.g., concentration of measure and neighborliness, appear to be disregarded [10].
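To make the notion of a Haar-like descriptor concrete, the following is a minimal sketch of a generic two-rectangle Haar-like feature computed from an integral image (summed-area table). It is not the informed feature pool proposed in the paper; the function names (`integral_image`, `rect_sum`, `two_rect_vertical`) and the toy window are assumptions introduced purely for illustration.

```python
def integral_image(img):
    """Build a summed-area table with an extra leading row and column of zeros.

    `img` is a list of equal-length rows of numbers (a toy grayscale window).
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of pixels in a rectangle, using four lookups in the table."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_vertical(ii, top, left, height, width):
    """Generic two-rectangle feature: upper-half sum minus lower-half sum.

    `height` is assumed to be even so both halves have the same area.
    """
    half = height // 2
    upper = rect_sum(ii, top, left, half, width)
    lower = rect_sum(ii, top + half, left, half, width)
    return upper - lower


# Hypothetical 4x4 window: a darker upper half above a brighter lower half.
window = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [3, 3, 3, 3],
    [3, 3, 3, 3],
]
ii = integral_image(window)
feature = two_rect_vertical(ii, 0, 0, 4, 4)
print(feature)
```

A large-magnitude response indicates a strong intensity contrast between the two regions. In a part-based setting, such rectangles could be placed over, for example, the head and upper-body regions of a detection window, although that placement is only an assumption here.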

The algorithm flow
Informed Haar-like features
Experiments and results analysis
Conclusions