Abstract

Although numerous approaches based on traditional sliding-window methods or prevailing anchor-based techniques have been proposed for deep learning-based pedestrian detection, it remains a promising yet challenging problem. In this paper, we propose a precise, flexible, and fully anchor-free and proposal-free framework named Pedestrian-as-Points Network (PP-Net) for pedestrian detection. Specifically, we model a pedestrian as a single point, i.e., the center point of the instance, and predict the pedestrian scale at each detected center point. To achieve higher accuracy, we build a pyramid-like structure on the backbone as a feature extractor that aggregates multi-level information. In addition, we construct a deep guidance module (DGM) on top of the backbone so that higher-level information is captured while building the feature pyramid network (FPN), which avoids diluting high-level information along the top-down pathway. We further design a feature fusion unit (FFU) to fuse the fine-level features with the coarse-level semantic information from the top-down pathway. With non-maximum suppression (NMS) as the only post-processing step, we achieve better performance than many state-of-the-art methods on challenging pedestrian detection datasets.
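The center-point formulation described above can be illustrated with a minimal decoding sketch. It is not the authors' implementation: the function names (`decode_centers`, `nms`), the log-height encoding of scale, the output stride of 4, the fixed width-to-height ratio of 0.41, and both thresholds are assumptions chosen for illustration only. The sketch takes a center heatmap and a per-pixel scale map, keeps local maxima above a score threshold, converts each into a box, and applies greedy NMS as the only post-processing step.

```python
import numpy as np

def decode_centers(heatmap, log_height, stride=4, score_thresh=0.5, aspect_ratio=0.41):
    """Convert a center heatmap and a log-height map into scored boxes.

    All parameters and the log-height encoding are illustrative assumptions,
    not values taken from the paper.
    Returns a list of (x1, y1, x2, y2, score) tuples in input-image pixels.
    """
    rows, cols = heatmap.shape
    # A pixel is a peak if it is >= every value in its 3x3 neighbourhood.
    padded = np.pad(heatmap, 1, mode="constant", constant_values=-np.inf)
    windows = np.stack([padded[dy:dy + rows, dx:dx + cols]
                        for dy in range(3) for dx in range(3)])
    is_peak = heatmap >= windows.max(axis=0)
    ys, xs = np.nonzero(is_peak & (heatmap >= score_thresh))

    boxes = []
    for y, x in zip(ys, xs):
        h = float(np.exp(log_height[y, x])) * stride   # predicted height in pixels
        w = aspect_ratio * h                           # width from the assumed ratio
        cx, cy = (x + 0.5) * stride, (y + 0.5) * stride  # cell center in pixels
        boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2,
                      float(heatmap[y, x])))
    return boxes

def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2, ...) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap it."""
    kept = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) <= iou_thresh for k in kept):
            kept.append(box)
    return kept

# Toy example: one clear peak, one weaker neighbour, one sub-threshold response.
heat = np.zeros((8, 8))
heat[2, 3], heat[2, 4], heat[6, 6] = 0.9, 0.6, 0.3
scale = np.full((8, 8), np.log(10.0))
detections = nms(decode_centers(heat, scale))
```

In this toy input only the response at row 2, column 3 survives: its neighbour is not a local maximum, and the third response falls below the score threshold. Because no anchors or proposals are involved, the only hyperparameters in the sketch are the thresholds and the assumed aspect ratio.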

Highlights

  • Deep neural networks (DNNs) based on the fully convolutional neural network have shown great improvements over systems relying on hand-crafted features [1]–[3] on benchmark tasks

  • We develop a novel framework called Pedestrian-as-Points Network (PP-Net) for real-time pedestrian detection, which effectively combines the semantic information of images at low resolution with details at high resolution

  • Based on the above knowledge, we propose a pyramid-like network named Pedestrian-as-Points Network (PP-Net) as illustrated in Fig. 1, which has two complementary modules that can detect the exact positions of pedestrians and simultaneously predict their sizes


Summary

Introduction

Deep neural networks (DNNs) based on the fully convolutional neural network have shown great improvements over systems relying on hand-crafted features [1]–[3] on benchmark tasks. Rapid progress in DNN research in recent years has dramatically advanced computer vision tasks such as object detection [4]–[6], image retrieval [7]–[9], scene recognition [10], [11], semantic segmentation [12]–[14], and image classification and inpainting [15], [16]. Pedestrians are among the main participants in the public transportation system, so pedestrian detection helps to make that system efficient and safe. In the past few years, the widely used anchor-based methods [25]–[29] have been dominant and have achieved tremendous progress.

