Abstract

Typical visual simultaneous localization and mapping (SLAM) systems rely on front-end odometry for feature extraction and matching to establish the relations between adjacent images. In a low-light environment, the image obtained by a camera is dim and contains little information, hindering the extraction of sufficient stable feature points and consequently undermining visual SLAM. Most existing methods focus on low-light enhancement of a single image, neglecting the strong temporal correlation across images in visual SLAM. We propose a method that leverages the temporal information of an input image sequence to enhance low-light images and uses the enhanced results to improve the feature extraction and matching quality of visual SLAM. Our method trains a three-dimensional convolutional neural network to estimate pixelwise grayscale transformation curves for low-light enhancement. These curves are then applied iteratively to obtain the final enhanced image. The training process of the network does not require any paired reference images. We also introduce a spatial consistency loss so that the enhanced image retains the content and texture of the original image. We further integrate our method into VINS-Mono and compare it with similar low-light image enhancement methods on the TUM-VI public dataset. The proposed method yields a lower positioning error: its positioning root-mean-squared error is 19.83% lower than that of Zero-DCE++ in low-light environments. Moreover, the proposed network runs in real time, making it suitable for integration into a SLAM system.
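
To illustrate the iterative application of pixelwise grayscale transformation curves described above, the sketch below shows one plausible form of the enhancement step. It assumes a Zero-DCE-style quadratic curve, LE(x) = x + α·x·(1 − x), since the abstract does not specify the exact curve formulation used by the proposed network; the function name, array shapes, and parameter values are hypothetical and for illustration only.

```python
import numpy as np

def apply_curves(image, curve_maps):
    """Iteratively apply pixelwise enhancement curves to a low-light image.

    image:      H x W array with intensities in [0, 1].
    curve_maps: list of H x W arrays of per-pixel curve parameters in [-1, 1],
                one map per iteration (quadratic Zero-DCE-style curve assumed;
                the paper's actual curve form may differ).
    """
    x = image.astype(np.float32)
    for alpha in curve_maps:
        # Quadratic curve LE(x) = x + alpha * x * (1 - x), applied per pixel.
        x = x + alpha * x * (1.0 - x)
    return np.clip(x, 0.0, 1.0)

# Example: brighten a synthetic dark image with 8 iterations of mild curves.
dark = np.random.uniform(0.0, 0.2, size=(4, 4))
curves = [np.full((4, 4), 0.6, dtype=np.float32) for _ in range(8)]
enhanced = apply_curves(dark, curves)
print(enhanced.mean() > dark.mean())  # the enhanced image is brighter
```

In the actual method, the per-iteration curve parameter maps would be predicted by the three-dimensional convolutional network from the input image sequence rather than set to constants as in this toy example.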
