Object detection via deeply exploiting depth information

Saihui Hou, Feng Wu, Zilei Wang

doi:10.1016/j.neucom.2018.01.055

Abstract

This paper addresses how to coordinate depth with RGB more effectively in order to improve RGB-D object detection. Specifically, we investigate two ideas under the CNN model: property derivation and property fusion. First, we propose that depth can be used not only as extra information alongside RGB but also to derive additional visual properties that describe the objects of interest more comprehensively. We therefore construct a two-stage learning framework consisting of property derivation and property fusion, in which properties can be derived from the provided color or depth alone or from color/depth pairs (e.g., the geometry contour). Second, we explore how to fuse different properties during feature learning, which under the CNN model reduces to the question of the layer at which the properties should be fused together. Our analysis shows that different semantic properties should be learned separately and combined before being passed to the final classifier. This detection scheme is consistent with the mechanism of the primary visual cortex (V1) in the brain. We evaluate the proposed method on the challenging NYUD2 and SUN RGB-D datasets, and on both it outperforms the baselines.
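The fusion finding, that properties are learned separately and combined only before the final classifier, can be illustrated with a minimal late-fusion sketch. This is not the authors' network: the property names (`rgb`, `depth`, `contour`), the toy input vectors, the single linear layer standing in for each CNN stream, and the dimensions are all assumptions chosen for illustration.

```python
import random

def linear(x, weights, bias):
    """Affine map: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def make_layer(n_in, n_out, rng):
    """Randomly initialised layer (hypothetical stand-in for a trained stream)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def late_fusion_scores(properties, streams, classifier):
    """Run each property through its own stream, concatenate, then classify."""
    fused = []
    for name in sorted(properties):  # fixed order keeps the concatenation stable
        weights, bias = streams[name]
        fused.extend(relu(linear(properties[name], weights, bias)))
    cls_weights, cls_bias = classifier
    return linear(fused, cls_weights, cls_bias)

rng = random.Random(0)

# Toy per-region inputs; real inputs would be image-like property maps.
properties = {
    "rgb": [0.2, 0.7, 0.1, 0.5],
    "depth": [0.9, 0.3, 0.4, 0.8],
    "contour": [0.0, 1.0, 1.0, 0.0],  # assumed to be derived from color/depth pairs
}

feat_dim, num_classes = 3, 5
streams = {name: make_layer(len(vec), feat_dim, rng) for name, vec in properties.items()}
classifier = make_layer(feat_dim * len(properties), num_classes, rng)

scores = late_fusion_scores(properties, streams, classifier)
```

The streams share no weights and interact only through the concatenated feature vector, which is the sense in which fusion is deferred to the last stage. Fusing earlier would instead concatenate the raw property vectors and pass them through a single shared stream.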

Full Text