Abstract

Convolutional Neural Networks (CNNs) have become the default paradigm for classification problems, especially, though not exclusively, in image recognition, mainly due to their high success rate. Although a number of approaches currently apply deep learning to the 3D shape recognition problem, they are either too slow for online use or too error-prone. To fill this gap, we propose 3DSliceLeNet, a deep learning architecture for point cloud classification. Our proposal converts the input point cloud into a two-dimensional representation by slicing it and projecting the points onto the principal planes, thus generating images that are fed to the convolutional architecture. 3DSliceLeNet achieves both high accuracy and low computational cost. An extensive set of experiments has been conducted to validate our system on the ModelNet challenge, a large-scale 3D Computer Aided Design (CAD) model dataset. Our proposal achieves a success rate of 94.37% and an Area Under the Curve (AUC) of 0.978 on the ModelNet-10 classification task.
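
To make the slicing step concrete, the following is a minimal sketch of how a point cloud could be cut into slabs and each slab projected onto a principal plane to form an image. The slice count, image resolution, slicing axis, and the `slice_and_project` helper are illustrative assumptions, not the paper's exact implementation.

```python
# Illustrative sketch (not the authors' code): cut a point cloud into slabs
# along one axis and project each slab onto the orthogonal principal plane.
import numpy as np

def slice_and_project(points, axis=2, n_slices=3, img_size=224):
    """Return one grayscale occupancy image per slice of the cloud."""
    # Normalize the cloud into the unit cube so rasterization is scale-free.
    mins, maxs = points.min(axis=0), points.max(axis=0)
    pts = (points - mins) / np.maximum(maxs - mins, 1e-9)

    keep = [a for a in range(3) if a != axis]  # axes spanning the plane
    edges = np.linspace(0.0, 1.0, n_slices + 1)
    images = np.zeros((n_slices, img_size, img_size), dtype=np.float32)

    for i in range(n_slices):
        lo, hi = edges[i], edges[i + 1]
        # Include the upper boundary on the last slab so no point is lost.
        upper = pts[:, axis] <= hi if i == n_slices - 1 else pts[:, axis] < hi
        mask = (pts[:, axis] >= lo) & upper
        uv = pts[mask][:, keep]
        # Rasterize the projected points as occupied pixels.
        px = np.clip((uv * (img_size - 1)).astype(int), 0, img_size - 1)
        images[i, px[:, 1], px[:, 0]] = 1.0
    return images

# Example: 2048 random points -> three slice images along the z axis.
slices = slice_and_project(np.random.rand(2048, 3))
print(slices.shape)  # (3, 224, 224)
```

In practice the same procedure would be repeated for each principal plane, yielding one stack of images per view.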

Highlights

  • Object recognition is one of the key problems to be solved for the development of a complete scene understanding system and is the main focus of this work

  • Many works have addressed 3D shape classification with deep learning techniques, and a large number of them apply Convolutional Neural Networks to 3D object recognition [5]–[7]

  • In this work, which is an extension of the doctoral thesis of Dr. Francisco Gomez-Donoso [11] and our previous work LonchaNet [12], we present an approach to 3D object recognition that uses multiple 2D views acquired from 3D models

Summary

INTRODUCTION

Object recognition is one of the key problems to be solved for the development of a complete scene understanding system and is the main focus of this work. Early work achieved an 83.50% accuracy, a modest figure by today's standards, but it paved the way for later research. Another approach to this problem is introduced by Xu and Todorovic in "Beam Search for Learning a Deep Convolutional Neural Network of 3D Shapes" [17], in which a beam search over CNN hyperparameters and architectures is proposed. In light of this literature review, our proposed method presents a novel approach to 3D model recognition that combines a multi-view object slicing scheme, based on the method of Setio et al. [40] for Computed Tomography (CT) images, with a modified version of the GoogLeNet [13] CNN architecture, achieving state-of-the-art performance while keeping the computational cost at bay. This design also allows the network to be adapted and tuned for other datasets.
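
As a rough illustration of how the slice images could feed such a classifier, the snippet below uses torchvision's stock GoogLeNet as a stand-in for the modified GoogLeNet mentioned above; the per-view channel stacking and the averaging of per-view scores are assumptions made for the example, not necessarily the paper's fusion scheme.

```python
# Hedged sketch: classify per-view slice images with a GoogLeNet-style CNN.
# torchvision's stock GoogLeNet stands in for the paper's modified network;
# fusing views by averaging their softmax scores is an assumption.
import torch
from torchvision.models import googlenet

model = googlenet(weights=None, num_classes=10,   # 10 classes as in ModelNet-10
                  aux_logits=False, init_weights=True)
model.eval()

# Suppose the slicing step produced three slices per principal plane; each
# view's slices are stacked as the three input channels GoogLeNet expects.
views = torch.rand(3, 3, 224, 224)  # (views, channels=slices, height, width)

with torch.no_grad():
    logits = model(views)               # one score vector per view
    probs = torch.softmax(logits, dim=1)
    fused = probs.mean(dim=0)           # average the per-view class scores
    predicted_class = int(fused.argmax())
print(predicted_class)
```

A trained model and real slice images would replace the random weights and tensors here; the sketch only shows the plumbing from per-view images to a fused class prediction.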

EXPERIMENTS
IMPACT OF THE NUMBER OF SLICES ON THE ACCURACY AND RUNTIME
CONCLUSION
Findings
FUTURE WORK