Abstract
In this paper, a framework is proposed for object recognition and pose estimation from color images using convolutional neural networks (CNNs). 3D object pose estimation, along with object recognition, has numerous applications, such as positioning a robot with respect to a target object and robotic object grasping. Previous methods addressing this problem relied on both color and depth (RGB-D) images to learn low-dimensional viewpoint descriptors for object pose retrieval. In the proposed method, a novel quaternion-based multi-objective loss function, combining manifold learning and regression, is used to learn 3D pose descriptors and to directly estimate the 3D object pose, using only color (RGB) images. The 3D object pose can then be obtained either by using the learned descriptors in a nearest neighbor (NN) search or by direct neural network regression. An extensive experimental evaluation demonstrates that these descriptors provide greater pose estimation accuracy than state-of-the-art methods. In addition, the learned 3D pose descriptors are largely object-independent and, thus, generalize to unseen objects. Finally, when the object identity is not of interest, the 3D object pose can be regressed directly by the network, bypassing the NN search and thus significantly reducing pose inference time.
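To make the idea of a quaternion-based multi-objective loss more concrete, the following is a minimal sketch, not taken from the paper: it assumes a PyTorch setting, a hypothetical geodesic quaternion distance as the regression term, and a standard triplet margin loss as the manifold-learning term on the descriptors. The exact terms, weighting, and network architecture used by the authors may differ.

```python
# Hypothetical sketch of a quaternion-based multi-objective loss:
# a pose-regression term plus a descriptor (manifold-learning) term.
# Names, the geodesic distance choice, and the weighting are assumptions.
import torch
import torch.nn.functional as F

def quaternion_distance(q_pred, q_true, eps=1e-7):
    """Angular distance between unit quaternions, invariant to the q / -q ambiguity."""
    q_pred = F.normalize(q_pred, dim=-1)
    q_true = F.normalize(q_true, dim=-1)
    dot = torch.sum(q_pred * q_true, dim=-1).abs().clamp(eps - 1, 1 - eps)
    return 2.0 * torch.acos(dot)  # geodesic angle on the rotation manifold

def multi_objective_loss(desc_anchor, desc_pos, desc_neg,
                         q_pred, q_true, margin=0.2, weight=1.0):
    # Manifold term: pull descriptors of nearby poses together, push distant ones apart.
    triplet = F.triplet_margin_loss(desc_anchor, desc_pos, desc_neg, margin=margin)
    # Regression term: penalize the angular error of the directly regressed quaternion.
    pose = quaternion_distance(q_pred, q_true).mean()
    return pose + weight * triplet
```

At inference, the pose could then be taken either directly from the regressed quaternion or, as the abstract describes, by an NN search of the learned descriptor against a database of descriptors with known poses.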