Abstract

Multispectral pedestrian detection is a difficult task, especially when pedestrians appear at different sizes. In most convolutional neural network (CNN) models, the receptive fields within each layer share the same size, which limits the detection of pedestrians at multiple scales. In this paper, we propose a dynamic selection scheme that adaptively adjusts the receptive field size in multispectral pedestrian detection. Specifically, a network in network (NIN) is employed to combine visible and thermal information. Selective kernel networks (SKNets), which use selective kernel units with different kernel sizes, are employed. To effectively fuse the feature representations in each layer, a building block is designed in which different features are fused. A feature pyramid is employed to integrate the feature information in each layer. We empirically show that our method outperforms 8 existing state-of-the-art methods on one multispectral dataset and 4 state-of-the-art methods on another multispectral dataset. Detailed analyses show that our proposed method can detect pedestrians of different scales in multispectral images, which confirms the effectiveness of SKNets for adaptively adjusting the receptive field size. In addition, our method can operate at 14 frames per second (fps) on a GPU.
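To make the idea of selecting among kernel sizes more concrete, the following is a minimal sketch of the fuse, squeeze, and select steps commonly associated with a selective kernel unit. It is not the implementation used in this paper: the function names (`global_avg_pool`, `matvec`, `sk_fuse`) and the weight matrices `w_reduce` and `w_branch` are hypothetical, the branch feature maps are assumed to have already been produced by convolutions with different kernel sizes, and plain Python lists stand in for tensors.

```python
import math


def global_avg_pool(fmap):
    """Average each channel of a [C][H][W] feature map down to one value."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def sk_fuse(branches, w_reduce, w_branch):
    """Fuse branch feature maps with per-channel softmax weights.

    branches: list of [C][H][W] maps, one per kernel size (assumed given)
    w_reduce: hypothetical d x C matrix for the squeeze step
    w_branch: hypothetical list of C x d matrices, one per branch
    Returns the fused [C][H][W] map and the [C][num_branches] weights.
    """
    n_branch = len(branches)
    n_ch = len(branches[0])
    height = len(branches[0][0])
    width = len(branches[0][0][0])

    # Fuse: element-wise sum of all branches.
    summed = [[[sum(b[c][i][j] for b in branches) for j in range(width)]
               for i in range(height)] for c in range(n_ch)]

    # Squeeze: global average pooling, then a ReLU-activated reduction.
    pooled = global_avg_pool(summed)
    z = [max(0.0, v) for v in matvec(w_reduce, pooled)]

    # Select: for every channel, a softmax across the branches.
    logits = [matvec(w, z) for w in w_branch]  # indexed [branch][channel]
    weights = []
    for c in range(n_ch):
        col = [logits[k][c] for k in range(n_branch)]
        peak = max(col)  # subtract the maximum for numerical stability
        exps = [math.exp(v - peak) for v in col]
        total = sum(exps)
        weights.append([e / total for e in exps])

    # Weighted sum of the branches using the selection weights.
    fused = [[[sum(weights[c][k] * branches[k][c][i][j] for k in range(n_branch))
               for j in range(width)] for i in range(height)] for c in range(n_ch)]
    return fused, weights
```

Because the weights for each channel sum to one across branches, the unit can lean toward the branch with the larger kernel for large pedestrians and toward the smaller kernel for small ones. A practical version would use learned convolutional and fully connected layers in a deep learning framework rather than fixed matrices.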
