Abstract
View-based 3D shape classification is widely used in machine vision, information retrieval and other fields. However, current methods have two problems. First, current 3D shape classifiers fail to make good use of the pose information of 3D shapes. Second, many views are required to obtain good classification accuracy, which leads to low efficiency. To solve these problems, we propose a novel 3D shape classification method based on a Convolutional Neural Network (CNN). In the training stage, the method first learns a CNN to extract features, and then uses the features of views from different viewpoint groups to train six 3D shape classifiers that exploit the pose information of 3D shapes. In addition, an extra class is adopted to improve the discriminative power of the 3D shape classifiers. In the recognition stage, a weighted fusion of image clarity evaluation functions is used to select the most representative view for 3D shape recognition. Experiments on ModelNet10 and ModelNet40 show that the classification accuracy of the proposed method reaches up to 91.18% and 89.01%, respectively, when using only a single view, and that efficiency is improved substantially.
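The view selection step in the recognition stage can be illustrated with a minimal sketch. The abstract does not name the clarity evaluation functions or their weights, so the three measures below (gradient energy, Laplacian variance, gray-level variance), the equal weights, the min-max normalization, and the function names are all assumptions chosen for illustration, not the authors' implementation.

```python
import numpy as np

def gradient_energy(img):
    """Mean squared finite-difference gradient (assumed clarity measure)."""
    gy, gx = np.gradient(img.astype(float))
    return float(np.mean(gx ** 2 + gy ** 2))

def laplacian_variance(img):
    """Variance of a 4-neighbour Laplacian response (assumed clarity measure)."""
    img = img.astype(float)
    lap = (img[:-2, 1:-1] + img[2:, 1:-1] + img[1:-1, :-2] + img[1:-1, 2:]
           - 4.0 * img[1:-1, 1:-1])
    return float(lap.var())

def gray_variance(img):
    """Variance of pixel intensities (assumed clarity measure)."""
    return float(img.astype(float).var())

def select_view(views, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Return the index of the view with the highest fused clarity score.

    Each measure is min-max normalized across the candidate views so that
    the weights act on comparable scales. The weights are hypothetical.
    """
    funcs = (gradient_energy, laplacian_variance, gray_variance)
    scores = np.array([[f(v) for f in funcs] for v in views])  # (n_views, 3)
    lo, hi = scores.min(axis=0), scores.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    fused = ((scores - lo) / span) @ np.asarray(weights, dtype=float)
    return int(np.argmax(fused)), fused

# Three toy "views": flat, low-contrast, and high-contrast.
rng = np.random.default_rng(0)
texture = rng.random((16, 16))
views = [np.zeros((16, 16)), 0.1 * texture, texture]
best, fused = select_view(views)
```

In this toy setup the high-contrast view receives the highest fused score. With real rendered views, the selected image would then be passed to the CNN feature extractor and the trained classifiers.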
Highlights
3D shape classification is a fundamental problem in computer graphics and computer vision [1].
An additional class is adopted to improve the discriminative power of the 3D shape classifiers.
The reasons are: (1) this method can represent a 3D shape from all positions and angles; (2) the 2D views generated under different viewpoint groups differ considerably, which makes it easy to use pose information to classify the 3D shape.
Summary
The primary issue in 3D shape classification is extracting feature descriptors that represent 3D shapes effectively. In the LFD, a 3D shape is projected to generate 100 views. This descriptor represents 3D shapes better than other descriptors, but its computational cost is high because a large number of views is used for classification. Methods that combine multiple views with deep learning have achieved good performance. Wang et al. propose a view clustering and pooling layer based on dominant sets for 3D object recognition. This method uses a fast approximate learning strategy for the cluster-pooling CNN and greatly improves training efficiency with only a slight reduction in accuracy [8]. Although a panoramic view is a single view, its size is equivalent to that of multiple views, so the computational complexity remains high.