Abstract
Recognition and pose estimation of 3D free-form objects is a key step for autonomous robotic manipulation. Recently, the point pair feature (PPF) voting approach has been shown to be effective for simultaneous object recognition and pose estimation. However, the global model descriptor (e.g., PPF and its variants) contains unnecessary point pair features that degrade recognition performance and increase computational cost. To address this issue, in this paper we introduce a novel strategy for building a global model descriptor from stably observed point pairs. The stably observed point pairs are computed from partial-view point clouds rendered by a virtual camera from various viewpoints. The global model descriptor is extracted from these point pairs and stored in a hash table. Experiments on several datasets show that the proposed method reduces redundant point pair features and achieves a better compromise between speed and accuracy.
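For context, the descriptor the abstract refers to is the four-dimensional point pair feature of the PPF voting framework [27], quantized and stored in a hash table keyed by the discretized feature. The sketch below illustrates that construction in Python/NumPy; the function names, step sizes, and the brute-force pair loop are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from collections import defaultdict

def point_pair_feature(p1, n1, p2, n2):
    """Four-dimensional PPF of the voting framework:
    F = (||d||, angle(n1, d), angle(n2, d), angle(n1, n2)), with d = p2 - p1."""
    d = p2 - p1
    dist = np.linalg.norm(d)
    if dist < 1e-9:
        return None
    d_hat = d / dist
    a1 = np.arccos(np.clip(np.dot(n1, d_hat), -1.0, 1.0))
    a2 = np.arccos(np.clip(np.dot(n2, d_hat), -1.0, 1.0))
    a3 = np.arccos(np.clip(np.dot(n1, n2), -1.0, 1.0))
    return dist, a1, a2, a3

def build_global_descriptor(points, normals, dist_step, angle_step):
    """Quantize each pair's PPF and store the pair index under the quantized key
    (a brute-force O(n^2) loop, kept only to keep the sketch short)."""
    table = defaultdict(list)
    for i in range(len(points)):
        for j in range(len(points)):
            if i == j:
                continue
            f = point_pair_feature(points[i], normals[i], points[j], normals[j])
            if f is None:
                continue
            key = (int(f[0] // dist_step),
                   int(f[1] // angle_step),
                   int(f[2] // angle_step),
                   int(f[3] // angle_step))
            table[key].append((i, j))
    return table
```

The proposed method would populate the same kind of table, but only with the stably observed point pairs rather than with all pairs of model points.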
Highlights
3D object recognition is a popular topic in computer vision with numerous applications including robotics, biometrics, navigation, remote sensing, medical diagnosis, entertainment, and education
1) We propose a multiview rendering strategy to sample the visible model points; 2) by defining the observability of a model point and of a point pair feature, the stably observable point pair features are extracted from the 3D model (see the sketch after these highlights); 3) we demonstrate significant performance improvements of our approach over the original PPF [27] baseline on several datasets
In the Proposed Algorithm section, we present the details of the proposed improvement, which builds on the well-known point pair feature voting approach [27], for 3D object recognition and pose estimation on point cloud data
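The highlights describe the stability filter only at a high level. One plausible reading is that a model point's observability is the fraction of rendered partial views in which it appears, and that a point pair is kept only when both endpoints are stably observed. The sketch below illustrates that reading in Python/NumPy; the nearest-neighbour visibility test, the threshold, and all function names are assumptions for illustration, not the paper's exact definitions.

```python
import numpy as np

def point_observability(model_points, partial_views, match_radius):
    """Fraction of rendered partial views in which each model point is observed.
    A point counts as observed in a view if some rendered point lies within
    match_radius of it (a simple nearest-neighbour visibility test)."""
    counts = np.zeros(len(model_points))
    for view in partial_views:                      # view: (K, 3) array of rendered points
        for idx, mp in enumerate(model_points):
            if np.min(np.linalg.norm(view - mp, axis=1)) < match_radius:
                counts[idx] += 1
    return counts / len(partial_views)              # observability in [0, 1]

def stable_point_pairs(observability, min_obs):
    """Index pairs whose endpoints are both stably observed."""
    stable = np.flatnonzero(observability >= min_obs)
    return [(int(i), int(j)) for i in stable for j in stable if i != j]
```

Under this reading, the retained pairs would feed the same hash-table construction shown after the abstract, so the descriptor contains fewer, more reliable entries.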
Summary
Many different methods have been proposed for 3D object recognition and pose estimation [1], such as 3D feature descriptor based methods [2]–[13] and template matching. Global feature-based algorithms are efficient in terms of computation time and memory consumption; however, they are sensitive to occlusion and clutter and require the objects of interest to be segmented from the background beforehand. Compared with global feature-based algorithms, local feature-based algorithms are more robust to occlusion and clutter [14], [15]. More recently, deep learning based methods have been introduced for 3D object recognition and pose estimation [19]–[24] and perform well on public datasets, but they require massive computational power and a large amount of time to prepare annotated datasets