Abstract

In this paper, we propose the multi-view saliency guided deep neural network (MVSG-DNN) for 3D object retrieval and classification. The method consists of three key modules. First, the model projection rendering module captures multiple views of a 3D object. Second, the visual context learning module applies a basic convolutional neural network to extract visual features from individual views and then employs a saliency LSTM to adaptively select representative views based on the multi-view context. Finally, with this information, the multi-view representation learning module compiles the final 3D object descriptors with the designed classification LSTM for 3D object retrieval and classification. The proposed MVSG-DNN makes two main contributions: 1) it jointly realizes the selection of representative views and the similarity measure by fully exploiting multi-view context; 2) it discovers the discriminative structure of a multi-view sequence without constraints of specific camera settings. Consequently, it supports flexible 3D object retrieval and classification in real applications by avoiding fixed camera settings. Extensive comparison experiments on ModelNet10, ModelNet40, and ShapeNetCore55 demonstrate the superiority of MVSG-DNN against state-of-the-art methods.
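
To make the described pipeline concrete, the sketch below illustrates the three modules in PyTorch. It is a minimal illustration under stated assumptions, not the authors' implementation: the small backbone CNN, the layer sizes, and the softmax view-weighting used here to stand in for saliency-guided view selection are all assumptions introduced for this example.

import torch
import torch.nn as nn

class MVSGDNN(nn.Module):
    """Illustrative sketch of the three-module pipeline from the abstract.
    Backbone, dimensions, and the view-weighting scheme are assumptions."""

    def __init__(self, num_classes=40, feat_dim=256, hidden_dim=256):
        super().__init__()
        # Module 2a: per-view visual feature extraction (a tiny CNN stands
        # in for the paper's "basic Convolutional Neural Networks").
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        # Module 2b: saliency LSTM scores each view in its multi-view
        # context so representative views can be emphasized.
        self.saliency_lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.saliency_head = nn.Linear(hidden_dim, 1)
        # Module 3: classification LSTM aggregates the re-weighted view
        # sequence into a single object descriptor.
        self.cls_lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, views):
        # views: (batch, num_views, 3, H, W) rendered projections of each
        # 3D object (module 1, projection rendering, happens offline).
        b, v = views.shape[:2]
        feats = self.cnn(views.flatten(0, 1)).view(b, v, -1)
        ctx, _ = self.saliency_lstm(feats)
        weights = torch.softmax(self.saliency_head(ctx).squeeze(-1), dim=1)
        weighted = feats * weights.unsqueeze(-1)  # soft view selection
        _, (h, _) = self.cls_lstm(weighted)
        descriptor = h[-1]                    # descriptor for retrieval
        logits = self.classifier(descriptor)  # class scores
        return descriptor, logits

# Usage: 12 rendered views for a batch of 2 objects.
model = MVSGDNN(num_classes=40)
descriptor, logits = model(torch.randn(2, 12, 3, 64, 64))
print(descriptor.shape, logits.shape)  # (2, 256) and (2, 40)

Here the saliency scores re-weight the view features (a soft selection) so the pipeline stays end-to-end differentiable; a hard selection of the top-scoring representative views, as the paper describes, could replace the softmax weighting.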
