Abstract

The use of human-level semantic information to aid robotic tasks has recently become an important area for both Computer Vision and Robotics. This has been enabled by advances in Deep Learning that allow consistent and robust semantic understanding. Leveraging this semantic vision of the world has allowed human-level understanding to naturally emerge from many different approaches. Particularly, the use of semantic information to aid in localisation and reconstruction has been at the forefront of both fields. Like robots, humans also require the ability to localise within a structure. To aid this, humans have designed high-level semantic maps of our structures called floorplans. We are extremely good at localising in them, even with limited access to the depth information used by robots. This is because we focus on the distribution of semantic elements, rather than geometric ones. Evidence of this is that humans are normally able to localise in a floorplan that has not been scaled properly. In order to grant this ability to robots, it is necessary to use localisation approaches that leverage the same semantic information humans use. In this paper, we present a novel method for semantically enabled global localisation. Our approach relies on the semantic labels present in the floorplan. Deep Learning is leveraged to extract semantic labels from RGB images, which are compared to the floorplan for localisation. While our approach is able to use range measurements if available, we demonstrate that they are unnecessary as we can achieve results comparable to state-of-the-art without them.
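The core idea described above, comparing semantic labels extracted from an RGB image against the labels present in a floorplan, can be sketched as a simple likelihood over a candidate pose. The snippet below is a minimal illustration rather than the paper's implementation: the functions `raycast_label` and `semantic_weight`, the label IDs, and the fixed hit/miss probabilities are all assumptions introduced for clarity; the actual SeDAR sensor model is described in the Methodology section.

```python
import numpy as np

# Hypothetical label IDs; SeDAR distinguishes classes such as wall/door/window.
WALL, DOOR, WINDOW = 0, 1, 2

def raycast_label(floorplan, pose, bearing, max_range=500):
    """Walk along a bearing from `pose` in a labelled occupancy grid and return
    the first non-free label hit (or None). `floorplan` is an integer array in
    which -1 marks free space."""
    x, y, theta = pose
    dx, dy = np.cos(theta + bearing), np.sin(theta + bearing)
    for step in range(max_range):
        cx, cy = int(round(x + step * dx)), int(round(y + step * dy))
        if not (0 <= cy < floorplan.shape[0] and 0 <= cx < floorplan.shape[1]):
            return None
        if floorplan[cy, cx] != -1:
            return floorplan[cy, cx]
    return None

def semantic_weight(floorplan, pose, bearings, observed_labels,
                    hit_prob=0.9, miss_prob=0.1):
    """Score a candidate pose by how well the semantic labels observed along
    each bearing agree with the labels predicted by raycasting the floorplan.
    No range measurements are used, only label agreement."""
    weight = 1.0
    for bearing, label in zip(bearings, observed_labels):
        predicted = raycast_label(floorplan, pose, bearing)
        weight *= hit_prob if predicted == label else miss_prob
    return weight
```

In a particle-filter setting, a score of this kind would be evaluated once per particle at every update step, which is what makes a purely label-based (range-less) sensor model practical.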

Highlights

  • Localisation, the process of finding a robot’s pose within a pre-existing map, is one of the most important aspects of both Computer Vision and Robotic systems

  • While the field of Monte-Carlo Localisation (MCL) evolved in the robotics community, non-MCL-based approaches became more popular in the vision community

  • Range-Based Monte-Carlo Localisation (RMCL) requires a floorplan and/or a previously created range-scan map that is accurate in scale and globally consistent, which presents a number of challenges

Summary

Introduction

Localisation, the process of finding a robot’s pose within a pre-existing map, is one of the most important aspects of both Computer Vision and Robotic systems. Examples of related approaches include Liu et al. (2015), who use visual cues such as Vanishing Points (VPs), and Chu et al. (2015), who perform piecemeal 3D reconstructions that can be fitted back to an extruded floorplan. While these approaches use innovative ways to extract 3D information from images, the data extracted from the image is normally not contained in the floorplan that the sensor is meant to localise in. Semantic Detection and Ranging (SeDAR) is a human-inspired framework that combines new semantic sensing capabilities with a novel semantic Monte-Carlo Localisation (MCL) approach. Experimental results show that the semantic labels are sufficiently strong visual cues that depth estimates are no longer needed. Not only does this vision-only approach perform comparably to depth-based methods, it also copes with floorplan inaccuracies more gracefully than strictly depth-based approaches. We present localisation results on several datasets and modalities.
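For orientation, a generic MCL iteration into which such a semantic sensor model plugs is sketched below. This is a simplified illustration under stated assumptions (Gaussian motion noise, systematic resampling, a user-supplied likelihood callable); the names `mcl_step`, `motion_noise` and `likelihood` are illustrative and not taken from the paper, whose motion and sensor models are detailed in the Methodology section.

```python
import numpy as np

def mcl_step(particles, weights, odom, likelihood, motion_noise=(0.05, 0.05, 0.02)):
    """One Monte-Carlo Localisation iteration: propagate particles with a noisy
    odometry increment, re-weight them with a (here, semantic) likelihood, and
    resample. `particles` is an (N, 3) array of (x, y, theta); `odom` is the
    (dx, dy, dtheta) increment in the robot frame; `likelihood(pose)` returns
    an unnormalised score for a single pose."""
    n = len(particles)

    # Motion model: apply the odometry increment in each particle's frame,
    # perturbed with Gaussian noise so the particle cloud stays diverse.
    dx, dy, dtheta = odom
    cos_t, sin_t = np.cos(particles[:, 2]), np.sin(particles[:, 2])
    particles[:, 0] += dx * cos_t - dy * sin_t + np.random.normal(0, motion_noise[0], n)
    particles[:, 1] += dx * sin_t + dy * cos_t + np.random.normal(0, motion_noise[1], n)
    particles[:, 2] += dtheta + np.random.normal(0, motion_noise[2], n)

    # Sensor model: weight each particle by how well the current semantic
    # observation agrees with the floorplan seen from that particle's pose.
    weights = np.array([likelihood(p) for p in particles])
    weights /= weights.sum() + 1e-12

    # Low-variance (systematic) resampling concentrates particles on poses
    # that explain the observation well.
    positions = (np.arange(n) + np.random.uniform()) / n
    indices = np.minimum(np.searchsorted(np.cumsum(weights), positions), n - 1)
    particles = particles[indices]
    weights = np.full(n, 1.0 / n)
    return particles, weights
```

In this sketch the semantic likelihood is the only place the observation enters the filter, which is why replacing a range-based model with a label-based one leaves the rest of the loop unchanged.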

Literature Review
Monte-Carlo Localisation
Closed-Form Localisation Approaches
Problem Definition
Methodology
Semantic Labelling and Sensing
Floorplan
SeDAR Sensor
RGB-D to SeDAR
Semantic Monte-Carlo Localisation
Motion Model
Sensor Model
Semantically Adaptive Standard Deviation
Range-Less Semantic Scan-Matching
Evaluation
Human-Readable Floorplans
Detailed Analysis of a Single Trajectory
Cross-Trajectory Performance
Inaccurate Hand-Drawn Map
Benchmark Evaluation
Timing
Conclusion
