Abstract
Simultaneous Localization and Mapping is a key requirement for many practical applications of robotics. However, traditional visual approaches rely on features extracted from textured surfaces, so they perform poorly in indoor scenes (e.g. long corridors containing large proportions of smooth walls). In this work, we propose a novel visual odometry method to overcome these limitations, which integrates the structural regularities of man-made environments into a direct sparse visual odometry system. By fully exploiting structural lines that align with the dominant directions of the Manhattan world, our approach becomes more accurate and more robust in texture-less indoor environments, especially long corridors. Given a series of input images, we first use the direct sparse method to obtain the coarse relative pose between camera frames, and then calculate vanishing points on each frame. Next, we use structural lines as rotation constraints and perform a sliding window optimization that reduces both photometric and rotation errors, further improving the trajectory accuracy. Benchmark tests show that our method outperforms the existing visual odometry approach in long corridor environments.
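To illustrate the vanishing point step mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes line segments that are parallel in 3D have already been detected and grouped, and it estimates their common vanishing point as the least-squares intersection of the corresponding homogeneous image lines. The function names `homogeneous_line` and `estimate_vanishing_point` are hypothetical.

```python
import numpy as np

def homogeneous_line(p, q):
    """Homogeneous line through two pixel points p and q, each given as (x, y)."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def estimate_vanishing_point(segments):
    """Least-squares intersection of image line segments.

    segments: iterable of ((x1, y1), (x2, y2)), assumed to be projections
    of lines that are parallel in 3D (an assumption of this sketch).
    Returns the vanishing point in homogeneous coordinates with unit norm.
    """
    lines = np.array([homogeneous_line(p, q) for p, q in segments])
    # Normalize each line so that every segment contributes comparably.
    lines /= np.linalg.norm(lines, axis=1, keepdims=True)
    # The vanishing point v minimizes ||lines @ v|| subject to ||v|| = 1,
    # which is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(lines)
    return vt[-1]

# Three segments whose supporting lines all pass through pixel (320, 240).
segments = [((0, 0), (160, 120)), ((0, 480), (160, 360)), ((0, 240), (100, 240))]
vp = estimate_vanishing_point(segments)
```

When the third homogeneous coordinate is not close to zero, dividing by it gives the vanishing point in pixel coordinates. A coordinate near zero indicates a vanishing point at or near infinity, which is why the sketch keeps the homogeneous form.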
Highlights
Estimating the position and orientation of an agent in an indoor scene is a challenging problem that is usually addressed by Simultaneous Localization and Mapping (SLAM) technologies (Bailey, Durrant-Whyte, 2006)
The advantage of using this structural regularity in a visual odometry (VO) system is obvious: parallel lines aligned with the Manhattan world create direction constraints that prevent local direction errors from growing
We describe the combination in detail, including the Manhattan world representation, the structural line parameterization, and the design of the error terms
Summary
Estimating the position and orientation of an agent in an indoor scene is a challenging problem that is usually addressed by Simultaneous Localization and Mapping (SLAM) technologies (Bailey, Durrant-Whyte, 2006). Architectural scenes, with their planes, texture-less walls, sharp angles and axially aligned geometries, often exhibit strong structural regularity, including parallelism and orthogonality, as shown in Figure 1 (Zhou et al., 2019). The existence of these structures provides an opportunity to constrain and simplify pose estimation. Such scenes can be abstracted as a Manhattan world (Coughlan, Yuille, 1999). The advantage of using this structural regularity in a VO system is obvious: parallel lines aligned with the Manhattan world create direction constraints that prevent local direction errors from growing. We designed new error terms to merge the structural information into local sliding window optimizations, which are performed over several keyframes to refine the camera pose and the point depth.
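The direction constraint described above can be sketched as follows. This is an illustrative formulation under stated assumptions, not the error terms defined in the paper: it assumes a pinhole camera with intrinsic matrix `K`, a world-to-camera rotation `R_cw`, and a Manhattan direction `d_world`, and it measures the angle between the predicted direction and the ray through an observed vanishing point. The names `rotation_residual` and `window_cost` and the scalar `weight` are hypothetical.

```python
import numpy as np

def rotation_residual(K, R_cw, d_world, vp_obs):
    """Angle in radians between a Manhattan direction predicted by the
    camera rotation and the ray through an observed vanishing point.

    K: 3x3 pinhole intrinsics, R_cw: 3x3 world-to-camera rotation,
    d_world: Manhattan direction in the world frame,
    vp_obs: observed vanishing point in homogeneous pixel coordinates.
    """
    predicted = R_cw @ d_world
    observed = np.linalg.solve(K, vp_obs)  # back-project: K^{-1} v
    # A vanishing point fixes a direction only up to sign, hence abs().
    cos = abs(predicted @ observed) / (
        np.linalg.norm(predicted) * np.linalg.norm(observed))
    return float(np.arccos(np.clip(cos, 0.0, 1.0)))

def window_cost(photometric_residuals, rotation_residuals, weight=1.0):
    """Sum of squared photometric and weighted squared rotation residuals
    (a plain least-squares combination, assumed for illustration)."""
    photometric = np.sum(np.square(photometric_residuals))
    rotation = np.sum(np.square(rotation_residuals))
    return float(photometric + weight * rotation)

K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
yaw = np.deg2rad(5.0)
R_drifted = np.array([[np.cos(yaw), 0.0, np.sin(yaw)],
                      [0.0, 1.0, 0.0],
                      [-np.sin(yaw), 0.0, np.cos(yaw)]])
corridor_axis = np.array([0.0, 0.0, 1.0])
vp_obs = np.array([320.0, 240.0, 1.0])  # vanishing point at the principal point
error = rotation_residual(K, R_drifted, corridor_axis, vp_obs)
```

In this example the observed vanishing point corresponds to a corridor axis aligned with the optical axis, so a rotation estimate that has drifted by 5 degrees in yaw yields a residual of 5 degrees. The residual depends only on rotation, not on translation or scene texture, which is what makes such a constraint useful where photometric information is weak.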