Abstract

With the development of low-cost RGB-D (Red Green Blue-Depth) sensors, RGB-D object recognition has attracted growing research attention in recent years. Deep learning has become popular in image analysis and has achieved competitive results. To make full use of the discriminative information in the RGB and depth images, we propose an RGB-D object recognition method based on a multi-modal deep neural network and Dempster-Shafer (DS) evidence theory. First, the RGB and depth images are preprocessed, and a separate convolutional neural network is trained for each modality. Next, we perform multi-modal feature learning, using the proposed quadruplet-sample-based objective function to fine-tune the network parameters. Then, two probabilistic classification results are obtained using two sigmoid SVMs (Support Vector Machines) with the learned RGB and depth features. Finally, a decision fusion method based on DS evidence theory integrates the two classification results. Compared with other RGB-D object recognition methods, the proposed method adopts two fusion strategies: multi-modal feature learning and DS decision fusion. It exploits both the discriminative information of each modality and the correlation information between the two modalities. Extensive experimental results validate the effectiveness of the proposed method.

Highlights

  • Object recognition is one of the fundamental problems in the fields of computer vision and robotics

  • To meet the input requirements of the two convolutional neural networks (CNNs), which use the basic AlexNet architecture, the input RGB and depth images are first scaled to 227 × 227

  • The Convolutional Fisher Kernels (CFK) method is used for recognition, which integrates the advantages of CNN and feature extraction, and the HHA encoding method is used for depth images


Summary

Introduction

Object recognition is one of the fundamental problems in the fields of computer vision and robotics. Features are learned from the RGB and depth images, and classifiers are then used for classification. This kind of method performs better, but it still does not make full use of the effective information contained in RGB-D images. We propose a comprehensive multi-modal objective function that includes two discriminative terms and one correlation term. For each modality, an effective weighted trust degree is designed according to the probability outputs of the two SVMs and the learned features. The paper covers RGB-D image preprocessing, the architecture and learning method of the proposed multi-modal feature learning approach, and the DS evidence theory based RGB-D object recognition method.
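The decision fusion step can be illustrated with Dempster's rule of combination. The sketch below is not the paper's exact procedure: the weighted trust degree is not specified in this summary, so standard Shafer discounting with hand-picked trust values stands in for it. The class names, probabilities, and the helpers `discount` and `dempster_combine` are all hypothetical.

```python
from itertools import product


def discount(probs, trust):
    """Turn class probabilities into a mass function (Shafer discounting).

    Each singleton class keeps trust * p; the remaining mass (1 - trust)
    goes to the whole frame, i.e. "unknown". The trust value is an
    assumption standing in for the paper's weighted trust degree.
    """
    frame = frozenset(probs)
    masses = {frozenset([c]): trust * p for c, p in probs.items()}
    masses[frame] = masses.get(frame, 0.0) + (1.0 - trust)
    return masses


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Mass assigned to contradictory hypotheses.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


# Hypothetical SVM probability outputs for one sample.
rgb_probs = {"apple": 0.6, "ball": 0.3, "cup": 0.1}
depth_probs = {"apple": 0.2, "ball": 0.7, "cup": 0.1}

fused = dempster_combine(discount(rgb_probs, 0.9), discount(depth_probs, 0.6))

# Pick the singleton class with the largest fused mass.
best_set = max((k for k in fused if len(k) == 1), key=fused.get)
best_class = next(iter(best_set))
print(best_class)
```

In this sketch, a lower trust value moves more of a modality's mass to the full frame, so that modality has less influence on the fused decision.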

Related Work
Proposed Method
RGB-D Image Preprocessing
Method of thewe
Method of channels the Proposed
RGB Feature Learning and Depth Feature Learning
Multi-Modal Feature Learning
Experimental Results
Dataset and Implementation Details
Objects
Comparison with Different Baselines
Comparison with State-of-the-Art Methods
Conclusions
Methods
