Abstract

The paper presents a deep neural network-based method for extracting global and local descriptors from a point cloud acquired by a rotating 3D LiDAR. The descriptors can be used for two-stage 6DoF relocalization. First, a coarse position is retrieved by finding candidates with the closest global descriptor in a database of geo-tagged point clouds. Then, the 6DoF pose between a query point cloud and a database point cloud is estimated by matching local descriptors and using a robust estimator such as RANSAC. Our method has a simple, fully convolutional architecture based on a sparse voxelized representation. It can efficiently extract a global descriptor and a set of keypoints with local descriptors from large point clouds with tens of thousands of points. Our code and pretrained models are publicly available on the project website.
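The two-stage pipeline described in the abstract (global-descriptor retrieval followed by local-descriptor matching with RANSAC) can be sketched with plain NumPy. This is a minimal illustration, not the paper's released code: function names, the 3-point minimal sample, and the inlier threshold are assumptions, and descriptor matching is reduced to precomputed keypoint correspondences.

```python
import numpy as np

def retrieve_candidates(query_global, db_globals, k=1):
    # Stage 1: L2 nearest-neighbour search over global descriptors.
    dists = np.linalg.norm(db_globals - query_global, axis=1)
    return np.argsort(dists)[:k]

def rigid_transform(P, Q):
    # Least-squares rotation R and translation t mapping P onto Q (Kabsch).
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cq - R @ cp

def ransac_pose(src, dst, iters=200, thresh=0.5, seed=0):
    # Stage 2: robust 6DoF pose from matched keypoints (N, 3) -> (N, 3).
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(src), size=3, replace=False)
        R, t = rigid_transform(src[idx], dst[idx])
        resid = np.linalg.norm(src @ R.T + t - dst, axis=1)
        inliers = resid < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refine on the consensus set.
    return rigid_transform(src[best_inliers], dst[best_inliers])
```

In practice the correspondences fed to `ransac_pose` would come from nearest-neighbour matching of the local descriptors between the query cloud and the retrieved database cloud.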

Highlights

  • Relocalization at a city scale is an emerging task with various applications in robotics and autonomous vehicles, such as loop closure in SLAM or the kidnapped robot problem [1]

  • We address the problem of point cloud-based relocalization at a city scale

  • We propose a network architecture to efficiently extract both a global descriptor for coarse-level place recognition and a set of keypoints with their local descriptors for 6DoF pose estimation


Summary

INTRODUCTION

Relocalization at a city scale is an emerging task with various applications in robotics and autonomous vehicles, such as loop closure in SLAM or the kidnapped robot problem [1]. A typical approach is a two-step process: 1) coarse localization using global descriptors, 2) precise 6DoF pose estimation by pairwise registration. Methods operating on such large point clouds typically convert them to some form of intermediate representation, such as a global point cloud descriptor or a set of multi-layer 2D images [5], before further processing. In contrast, our method processes raw 3D point clouds using a sparse voxelized representation and a 3D convolutional architecture, without converting the point cloud to an intermediate form. The main contribution of our work is an efficient network architecture that extracts both a global descriptor for coarse-level place recognition and a set of keypoints with local descriptors for precise 6DoF pose estimation. The method efficiently processes raw point clouds, with tens of thousands of points, acquired by modern 360° rotating LiDAR sensors.
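The sparse voxelized representation mentioned above amounts to quantizing continuous point coordinates into integer voxel indices and keeping only the occupied voxels, which then form the support of a sparse convolutional network. A minimal NumPy sketch of this preprocessing step (the voxel size here is illustrative, not the paper's setting):

```python
import numpy as np

def voxelize(points, voxel_size=0.3):
    # Quantize (N, 3) coordinates into integer voxel indices and keep one
    # representative point per occupied voxel; the returned integer
    # coordinates define the support of a sparse 3D convolution.
    coords = np.floor(points / voxel_size).astype(np.int64)
    _, first_idx = np.unique(coords, axis=0, return_index=True)
    keep = np.sort(first_idx)  # preserve original point order
    return coords[keep], points[keep]
```

Compared with projecting the cloud onto multi-layer 2D images, this keeps the full 3D structure while the sparsity keeps memory and compute proportional to the number of occupied voxels rather than the dense grid volume.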

RELATED WORK
EGONN: EGOCENTRIC NEURAL NETWORK FOR
Network Architecture
Network Training
Datasets and Evaluation Methodology
Results and Discussion
CONCLUSION
