Abstract

Accurately estimating the current state of local traffic scenes is one of the key problems in the development of software components for automated vehicles. In addition to details on free space and drivability, the desired representation may also include static and dynamic traffic participants and semantic information. Multi-layer grid maps allow the inclusion of all of this information in a common representation. However, most existing grid mapping approaches only process range sensor measurements such as Lidar and Radar and solely model occupancy without semantic states. To add sensor redundancy and diversity, it is desirable to integrate vision-based sensor setups into a common grid map representation. In this work, we present a semantic evidential grid mapping pipeline, including estimates for eight semantic classes, that is designed for straightforward fusion with range sensor data. Unlike other publications, our representation explicitly models uncertainties in the evidential model. We present results of our grid mapping pipeline based on a monocular vision setup and a stereo vision setup. Our mapping results are accurate and dense due to the incorporation of a disparity- or depth-based ground surface estimation in the inverse perspective mapping. We conclude this paper with a detailed quantitative evaluation based on real traffic scenarios from the KITTI odometry benchmark dataset, demonstrating the advantages compared to other semantic grid mapping approaches.
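The geometry behind a disparity- or depth-based inverse perspective mapping can be illustrated with a minimal sketch: assuming a pinhole camera with intrinsics K and a ground plane n · X = d estimated from the depth map, each pixel is back-projected onto that plane and quantized into a grid cell. The function names, cell layout, and example values below are hypothetical and only illustrate the geometry, not the paper's implementation.

```python
import numpy as np

def ipm_to_ground(K, n, d, pixels):
    """Back-project pixels (u, v) onto the estimated ground plane
    n . X = d in the camera frame, using a pinhole model.
    Hypothetical helper, not the paper's implementation."""
    rays = np.hstack([pixels, np.ones((len(pixels), 1))]) @ np.linalg.inv(K).T
    t = d / (rays @ n)                  # ray scale where each ray hits the plane
    valid = t > 0                       # discard pixels above the horizon
    return (rays * t[:, None])[valid], valid

def to_cells(points, cell_size=0.1):
    """Quantize ground points (x: right, z: forward) to grid indices."""
    return np.floor(points[:, [0, 2]] / cell_size).astype(int)

# Example: map two pixels of a KITTI-like camera onto a flat ground plane
# 1.65 m below the camera (n points toward the ground, d is camera height).
K = np.array([[721.5, 0.0, 609.6], [0.0, 721.5, 172.9], [0.0, 0.0, 1.0]])
pts, valid = ipm_to_ground(K, n=np.array([0.0, 1.0, 0.0]), d=1.65,
                           pixels=np.array([[620.0, 250.0], [100.0, 300.0]]))
print(to_cells(pts))
```

With a per-pixel semantic segmentation, the same back-projection assigns each valid pixel's class label to its grid cell, which is what makes the resulting semantic map dense compared to sparse range measurements.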

Highlights

  • Environment perception modules in automated driving aim at solving a wide range of tasks

  • We present a semantic evidential grid mapping pipeline, including estimates for eight semantic classes, that is designed for straightforward fusion with range sensor data

  • We present results of our grid mapping pipeline based on a monocular vision setup and a stereo vision setup

Introduction

Environment perception modules in automated driving aim at solving a wide range of tasks. One of these is the robust and accurate detection and state estimation of other traffic participants in areas that are observable by on-board sensors. Occupancy grid maps are frequently considered, as their dense grid structure enables the detection of other traffic participants while modeling occlusions. Most of the presented methods only process range sensor measurements such as Lidar and Radar and solely model occupancy without semantic states. In [1], we presented a semantic evidential fusion approach for multi-layer grid maps by introducing a refined set of hypotheses that allows the joint modeling of occupancy and semantic states in a common representation. We use the same evidence-theoretical framework and present two improved sensor models for stereo vision and monocular vision that can be incorporated into the sensor data fusion presented in [1].
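As a rough illustration of how such an evidence-theoretical framework fuses per-cell beliefs from different sensors, the sketch below applies Dempster's rule of combination to two mass functions over a toy hypothesis set. The class labels and mass values are invented for illustration and do not reproduce the refined hypothesis set of [1].

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions (dicts mapping frozenset hypotheses to
    masses) with Dempster's rule, normalizing out the conflict mass."""
    combined, conflict = {}, 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb
    norm = 1.0 - conflict
    return {h: m / norm for h, m in combined.items()}

# Toy per-cell fusion: a camera cell suggesting "vehicle", a range-sensor
# cell suggesting "occupied" (vehicle or building); labels are illustrative.
VEHICLE, BUILDING, FREE = "vehicle", "building", "free"
m_cam = {frozenset({VEHICLE}): 0.6,
         frozenset({VEHICLE, BUILDING, FREE}): 0.4}   # residual ignorance
m_lidar = {frozenset({VEHICLE, BUILDING}): 0.7,
           frozenset({VEHICLE, BUILDING, FREE}): 0.3}
for h, m in dempster_combine(m_cam, m_lidar).items():
    print(sorted(h), round(m, 2))
```

Because mass can be assigned to sets of classes rather than single classes, a sensor can express partial knowledge (e.g., "occupied, but class unknown"), which is what allows occupancy and semantics to coexist in one representation.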
