Abstract

Recent developments in depth imaging technology have made depth information easier to acquire. With the additional depth cue, RGB-D cameras can effectively support many perception tasks beyond what traditional RGB information allows. However, current feature representations for RGB-D images use depth information only to extract local features; they do not merge depth cues into feature pooling to improve the robustness and discriminability of the representation. The spatial pyramid model (SPM) has become the standard protocol for splitting the 2D image plane into sub-regions for feature pooling of RGB-D images. We argue that SPM may not be the optimal pooling scheme for RGB-D images, as it pools features only spatially and completely discards their depth topological structure. Instead, we propose a novel joint spatial-depth pooling (JSDP) scheme, which further partitions SPM using the depth cue and pools features simultaneously in the 2D image plane and along the depth direction. By combining JSDP with standard feature extraction and feature encoding modules, we outperform state-of-the-art methods on benchmarks for RGB-D object classification, detection, and scene recognition.
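
To make the pooling idea concrete, the following is a minimal sketch of how each spatial pyramid cell might be further split along depth before pooling. It is not the authors' implementation: the function name `jsdp_pool`, the uniform depth binning between the minimum and maximum observed depth, the use of max pooling, and the default pyramid levels are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def jsdp_pool(codes, xy, depth, levels=(1, 2, 4), depth_bins=2):
    """Pool encoded local features over spatial pyramid cells split by depth.

    codes: (N, D) encoded local features.
    xy:    (N, 2) feature positions (x, y), normalized to [0, 1].
    depth: (N,) depth value at each feature position.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    codes = np.asarray(codes, dtype=float)
    xy = np.asarray(xy, dtype=float)
    depth = np.asarray(depth, dtype=float)
    dim = codes.shape[1]

    # Assumption: uniform depth bins between the min and max observed depth.
    d_min, d_max = depth.min(), depth.max()
    span = max(d_max - d_min, 1e-12)  # avoid division by zero on flat depth
    d_idx = ((depth - d_min) / span * depth_bins).astype(int)
    d_idx = np.minimum(d_idx, depth_bins - 1)  # put d_max in the last bin

    pooled = []
    for g in levels:
        # Index of the g x g spatial cell each feature falls into.
        cell = np.minimum((xy * g).astype(int), g - 1)
        for row in range(g):
            for col in range(g):
                in_cell = (cell[:, 0] == col) & (cell[:, 1] == row)
                for b in range(depth_bins):
                    mask = in_cell & (d_idx == b)
                    # Assumption: max pooling; empty sub-regions give zeros.
                    if mask.any():
                        pooled.append(codes[mask].max(axis=0))
                    else:
                        pooled.append(np.zeros(dim))
    return np.concatenate(pooled)
```

Under these assumptions the output vector has length `D * depth_bins * sum(g * g for g in levels)`, so setting `depth_bins=1` reduces the sketch to plain spatial pyramid pooling.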

