Abstract

Abstract. With increasing attention for the indoor environment and the development of low-cost RGB-D sensors, indoor RGB-D images are easily acquired. However, scene semantic segmentation is still an open area, which restricts indoor applications. The depth information can help to distinguish the regions which are difficult to be segmented out from the RGB images with similar color or texture in the indoor scenes. How to utilize the depth information is the key problem of semantic segmentation for RGB-D images. In this paper, we propose an Encode-Decoder Fully Convolutional Networks for RGB-D image classification. We use Multiple Kernel Maximum Mean Discrepancy (MK-MMD) as a distance measure to find common and special features of RGB and D images in the network to enhance performance of classification automatically. To explore better methods of applying MMD, we designed two strategies; the first calculates MMD for each feature map, and the other calculates MMD for whole batch features. Based on the result of classification, we use the full connect CRFs for the semantic segmentation. The experimental results show that our method can achieve a good performance on indoor RGB-D image semantic segmentation.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.