Abstract

The advancement of low-cost RGB-D and LiDAR three-dimensional (3D) sensors has made it easier to obtain 3D models in real time. However, extracting intricate 3D features remains crucial to advancing 3D object classification. Existing volumetric voxel-based CNN approaches have achieved remarkable progress, but they incur a huge computational overhead that limits the extraction of global features at higher resolutions of 3D objects. In this paper, a low-cost 3D volumetric deep convolutional neural network is proposed for 3D object classification based on joint multiscale hierarchical and subvolume supervised learning strategies. Our proposed deep neural network takes 3D data preprocessed into a memory-efficient octree representation, and we propose limiting the full-layer octree depth to a certain level, based on the predefined input volume resolution, in order to store high-precision contour features. Multiscale features are concatenated from multiple octree depths inside the network, aiming to adaptively generate high-level global features. The subvolume supervision strategy trains the network on subparts of the 3D object in order to learn local features. Our framework has been evaluated on two publicly available 3D repositories. Experimental results demonstrate the effectiveness of the proposed method: classification accuracy is improved in comparison to existing volumetric approaches, and memory consumption and run-time are significantly reduced.
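The depth-limited octree preprocessing described above can be illustrated with a minimal sketch. This is not the paper's implementation; it assumes points normalized to the unit cube and uses illustrative names (`build_octree`, `max_depth`). Capping the depth bounds the finest resolution at 2^depth voxels per axis, which bounds memory while keeping every coarser level available for the multiscale feature concatenation the paper describes.

```python
# Hypothetical sketch: build a depth-limited octree occupancy map from a
# point cloud. Points are assumed normalized to the unit cube [0, 1)^3.
def build_octree(points, max_depth):
    """Map (depth, ix, iy, iz) -> number of points in that octree cell.

    Limiting max_depth caps the finest grid at 2**max_depth cells per
    axis; shallower depths give the coarser levels whose features a
    multiscale network could concatenate.
    """
    nodes = {}
    for depth in range(max_depth + 1):
        res = 2 ** depth  # cells per axis at this depth
        for x, y, z in points:
            key = (depth, int(x * res), int(y * res), int(z * res))
            nodes[key] = nodes.get(key, 0) + 1
    return nodes

# Two points in opposite corners; octree limited to depth 2 (4x4x4 grid).
tree = build_octree([(0.1, 0.1, 0.1), (0.9, 0.9, 0.9)], max_depth=2)
```

At depth 0 both points share the root cell; at depth 2 they occupy opposite corner cells, so only occupied branches need storage, which is the memory saving octree representations exploit.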

Highlights

  • Academic Editor: Daniele Bibbo

  • The advancement of low-cost RGB-D and LiDAR three-dimensional (3D) sensors has made it easier to obtain 3D models in real time

  • Voxels are commonly used to represent the 3D model volumetrically and are fed directly to a convolutional neural network (CNN). 3DShapeNets [15], the first volumetric CNN, was proposed by Wu et al., who also released ModelNet [6], a rich repository of 3D CAD models. 3DShapeNets adapted the Convolutional Deep Belief Network (CDBN) from 2D deep learning (DL) to 3D distributions and was applied to three different tasks: 3D object classification, view prediction, and retrieval

  • We present a novel point-octree-based volumetric deep convolutional neural network for 3D object classification


Summary

Related Work

Motivated by the results achieved by DL methods [2, 3, 31] in the 2D domain, DL has recently attracted the attention of the 3D computer vision community for learning the complex structures of 3D data. Point cloud CNNs are becoming popular because they can input point cloud data directly, and much work has been done on multiple computer vision tasks (e.g., object recognition and 3D reconstruction) [49, 50, 51]. In this domain, PointNet [50], the pioneering deep network framework, was trained on unordered point sets in 3D environments to perform classification and segmentation tasks. Riegler et al. [23] proposed a novel octree representation to produce high-resolution volumetric features for their OctNet, which was applied to 3D classification, orientation estimation, and segmentation tasks. OctNet inputs a high-resolution geometric 3D volumetric data tensor while consuming less memory, and its classification accuracy on the small-scale ModelNet dataset improved significantly compared with other volumetric methods. Our MS-VDCNN outperforms OctNet on the ModelNet [6] datasets

