Abstract

Recognition and pose estimation of 3D free-form objects is a key step for autonomous robotic manipulation. Recently, the point pair features (PPF) voting approach has been shown to be effective for simultaneous object recognition and pose estimation. However, the global model descriptor (e.g., PPF and its variants) contains unnecessary point pair features that decrease recognition performance and increase computational cost. To address this issue, in this paper we introduce a novel strategy for building a global model descriptor from stably observed point pairs. The stably observed point pairs are computed from partial-view point clouds rendered by a virtual camera from various viewpoints. The global model descriptor is extracted from these stably observed point pairs and stored in a hash table. Experiments on several datasets show that the proposed method reduces redundant point pair features and achieves a better compromise between speed and accuracy.
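The global model descriptor the abstract refers to follows the classical PPF construction: for every ordered pair of model points, a four-dimensional feature (distance, two normal-to-line angles, normal-to-normal angle) is quantized and used as a hash key. The following is a minimal sketch of that construction, assuming simple uniform quantization steps (`d_step`, `a_step` are illustrative parameters, not values from the paper):

```python
import numpy as np

def ppf(p1, n1, p2, n2):
    # Four-dimensional point pair feature:
    # (||d||, angle(n1, d), angle(n2, d), angle(n1, n2))
    d = p2 - p1
    dist = np.linalg.norm(d)
    if dist < 1e-12:
        return None
    du = d / dist
    ang = lambda a, b: np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))
    return (dist, ang(n1, du), ang(n2, du), ang(n1, n2))

def quantize(f, d_step=0.05, a_step=np.deg2rad(12)):
    # Discretize the feature so that similar pairs fall into the same bucket.
    return (int(f[0] / d_step),
            int(f[1] / a_step),
            int(f[2] / a_step),
            int(f[3] / a_step))

def build_hash_table(points, normals):
    # Global model descriptor: map quantized PPF -> list of (i, j) index pairs.
    table = {}
    n = len(points)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            f = ppf(points[i], normals[i], points[j], normals[j])
            if f is None:
                continue
            table.setdefault(quantize(f), []).append((i, j))
    return table
```

The paper's contribution is to restrict this table to stably observed pairs rather than all O(n²) pairs, which is what shrinks the descriptor and speeds up voting.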

Highlights

  • 3D object recognition is a popular topic in computer vision with numerous applications including robotics, biometrics, navigation, remote sensing, medical diagnosis, entertainment, and education

  • 1) We propose a multiview rendering strategy to sample the visible model points; 2) by defining the observability of a model point and the observability of a point pair feature, the stably observable point pair features are extracted from the 3D model; 3) we demonstrate significant performance improvements of our approach over the original PPF baseline [27] on several datasets

  • Proposed algorithm: we present the details of the proposed improvement to the well-known point pair feature voting approach [27] for 3D object recognition and pose estimation on point cloud data
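The observability idea in the highlights can be sketched as follows, assuming each virtual-camera rendering yields a set of visible point indices (e.g., via hidden point removal). The co-visibility threshold `tau` and the "both points co-visible" criterion are illustrative assumptions; the paper's exact observability definitions may differ:

```python
import numpy as np

def point_observability(visible_sets, n_points):
    # visible_sets: one set of visible point indices per rendered viewpoint.
    # Returns, per point, the fraction of viewpoints in which it is seen.
    obs = np.zeros(n_points)
    for s in visible_sets:
        obs[list(s)] += 1
    return obs / len(visible_sets)

def stable_pairs(visible_sets, n_points, tau=0.3):
    # Keep a pair (i, j) if both points are co-visible in at least a
    # fraction tau of viewpoints (hypothetical stability criterion).
    co = {}
    for s in visible_sets:
        idx = sorted(s)
        for a in range(len(idx)):
            for b in range(a + 1, len(idx)):
                key = (idx[a], idx[b])
                co[key] = co.get(key, 0) + 1
    thresh = tau * len(visible_sets)
    return [pair for pair, c in co.items() if c >= thresh]
```

Only the surviving pairs would then be featurized and hashed, so pairs that are rarely visible together from any viewpoint never enter the global descriptor.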

Summary

Introduction

Many different methods have been proposed for 3D object recognition and pose estimation [1], such as 3D feature descriptor based methods [2]–[13] and template matching. Global feature-based algorithms are efficient in terms of computation time and memory consumption, but they are sensitive to occlusion and clutter and require the objects of interest to be segmented from the background beforehand. Compared with global feature-based algorithms, local feature-based algorithms are more robust to occlusion and clutter [14], [15]. More recently, deep learning based methods have been introduced into 3D object recognition and pose estimation [19]–[24] and have shown good performance on public datasets; however, they require massive computational power and a large amount of time to prepare annotated datasets.
