Abstract

Deep learning approaches that estimate full 3D object orientations, in addition to object classes, are limited in accuracy because the continuous nature of three-axis orientation variation is difficult to learn by regression or classification with sufficient generalization. This paper presents a novel progressive deep learning framework, herein referred to as 3D POCO Net, that offers high accuracy in estimating orientations about three rotational axes while remaining efficient in network complexity. 3D POCO Net is configured using four PointNet-based networks that independently represent the object class and the three individual axes of rotation. The four independent networks are linked by in-between association subnetworks, trained to progressively map the global features learned by the individual networks one after another so as to fine-tune them. High accuracy is achieved by combining high-precision classification over a large number of orientation classes with regression based on a weighted sum of the classification outputs, while high efficiency is maintained by the progressive framework, in which the large number of orientation classes is grouped into independent networks linked by association subnetworks. We implemented 3D POCO Net for full three-axis orientation variation and trained it on about 146 million orientation variations augmented from the ModelNet10 dataset. Testing shows an orientation regression error of about 2.5° with about 90% accuracy in object classification for general three-axis orientation estimation and object classification. Furthermore, we demonstrate that a pre-trained 3D POCO Net can serve as an orientation representation platform on which the orientations, as well as the object classes, of partial point clouds from occluded objects are learned via transfer learning.
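The regression step described above, a weighted sum over orientation-class outputs, can be sketched as follows. This is a minimal illustration only: the bin count, the 0–360° single-axis layout, and the use of a circular mean are assumptions for the sketch, not details taken from the paper.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def soft_orientation_regression(logits, num_bins=360):
    """Estimate a continuous angle (degrees) for one rotation axis by taking
    the probability-weighted sum of orientation-class bin centers.
    Bin count and range are illustrative; the paper's class layout may differ."""
    bin_centers = np.arange(num_bins) * (360.0 / num_bins)  # 0, 1, ..., 359 deg
    probs = softmax(logits)
    # Use a circular mean so bins near 0 deg and 360 deg average correctly.
    rad = np.deg2rad(bin_centers)
    x = (probs * np.cos(rad)).sum(axis=-1)
    y = (probs * np.sin(rad)).sum(axis=-1)
    return np.rad2deg(np.arctan2(y, x)) % 360.0

# A sharp peak at 45 deg with some probability mass in neighboring bins:
logits = np.full(360, -10.0)
logits[44:47] = [2.0, 5.0, 2.0]
print(round(float(soft_orientation_regression(logits)), 1))  # prints 45.0
```

The weighted sum lets the estimate fall between bin centers, which is how a classification over discrete orientation classes can still yield a sub-degree continuous regression output.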

Highlights

  • Data representation is crucial when dealing with 3D objects

  • To deal with the trade-off, hybrid approaches are introduced such that the strength of deep learning approaches for object detection and recognition and that of conventional vision technologies for high precision orientation estimation are combined [9]

  • The training of 3D POCO Net starts with training of the reference network for object classification based on the reference samples, i.e., the 3D object sub-dataset with the reference orientation as the input


Summary

Introduction

Data representation is crucial when dealing with 3D objects. Available benchmark datasets include ShapeNet [6], LineMod [7], and OCID [8], in which 3D point clouds are derived either from object CAD models or from actual measurements. Deep learning approaches to 3D object recognition and orientation estimation focus on relaxing their limitations through a trade-off between precision and complexity. To deal with this trade-off, hybrid approaches have been introduced that combine the strength of deep learning for object detection and recognition with that of conventional vision technologies for high-precision orientation estimation [9]. Hybrid approaches seek precision in orientation estimation at the expense of the computational cost associated with conventional vision technologies. Should end-to-end deep learning approaches to orientation estimation be considered, they have to limit the number of orientation classes, or the precision of the regression, to a manageable level [10].
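The abstract's answer to this class-count problem is to decompose the joint orientation space into per-axis networks linked by association subnetworks. A rough structural sketch is below; the toy layer sizes, the random linear layers standing in for trained PointNet encoders, and the additive feature combination are all illustrative assumptions, not the published architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(in_dim, out_dim):
    """A single random ReLU layer standing in for a trained subnetwork."""
    W = rng.standard_normal((in_dim, out_dim)) * 0.1
    return lambda x: np.maximum(x @ W, 0.0)

# Shrunken toy sizes (hypothetical): 128 points, 64-d global feature,
# 36 orientation classes per axis, 10 object classes (as in ModelNet10).
N_POINTS, FEAT, AXIS_CLASSES, OBJ_CLASSES = 128, 64, 36, 10
IN = 3 * N_POINTS

# Four independent encoders: one for the object class, three for the axes,
# linked in sequence by association subnetworks.
encoders = [mlp(IN, FEAT) for _ in range(4)]
assoc = [mlp(FEAT, FEAT) for _ in range(3)]
class_head = mlp(FEAT, OBJ_CLASSES)
axis_heads = [mlp(FEAT, AXIS_CLASSES) for _ in range(3)]

points = rng.standard_normal(IN)  # a flattened toy point cloud
feats, prev = [], None
for i, enc in enumerate(encoders):
    f = enc(points)
    if prev is not None:
        f = f + assoc[i - 1](prev)  # association subnetwork links networks
    feats.append(f)
    prev = f

obj_logits = class_head(feats[0])
axis_logits = [h(f) for h, f in zip(axis_heads, feats[1:])]
print(obj_logits.shape, [a.shape for a in axis_logits])
```

The point of the decomposition is combinatorial: three per-axis classifiers of `k` classes each produce `3k` outputs, whereas a single joint classifier over all three axes would need `k**3` classes, which is what forces end-to-end approaches to coarsen their orientation classes.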


