Abstract
Forward-looking depth information plays a considerable role in advanced driving assistance systems. In this paper, we first propose a depth map estimation method based on semi-supervised learning, which uses the left and right views of binocular vision and sparse depth values as inputs to train a deep learning network with an encoding–decoding structure. Compared with unsupervised networks trained without sparse depth labels, the proposed semi-supervised network improves the estimation accuracy of depth maps. Secondly, this paper combines the estimated depth map with the results of instance segmentation to measure the distance between the subject vehicle and a target vehicle or pedestrian. For the distance between the subject vehicle and a pedestrian, this paper proposes a depth histogram-based method that averages the depth values of all pixels whose depth falls in the peak range of that pedestrian's depth histogram. For the distance between the subject vehicle and a target vehicle, this paper proposes a method that first fits a 3-D plane to the locations of the target points in the camera body coordinate system using RANSAC (RANdom SAmple Consensus), then projects all the pixels of the target onto this plane, and finally uses the minimum depth value of the projected points to calculate the distance to the target vehicle. Quantitative and qualitative comparisons on the KITTI dataset show that the proposed method can effectively estimate depth maps. Experimental results in real road scenarios and on the KITTI dataset confirm the accuracy of the proposed distance measurement methods.
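The depth histogram-based pedestrian measurement described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `pedestrian_distance`, the fixed bin width, and the choice to treat the single most populated bin as the "peak range" are all assumptions made for illustration. The input is assumed to be the list of estimated depth values (in metres) for the pixels inside one pedestrian's instance mask.

```python
def pedestrian_distance(depths, bin_width=0.5):
    """Mean depth of the pixels in the most populated histogram bin.

    depths    -- depth values (metres) of the pixels in one pedestrian mask
    bin_width -- hypothetical histogram bin width in metres (an assumption)
    """
    # Discard non-positive values, which are assumed to mark invalid depths.
    valid = [d for d in depths if d > 0]
    if not valid:
        raise ValueError("no valid depth values")

    # Build a fixed-width histogram, keeping the values that fall in each bin.
    lowest = min(valid)
    bins = {}
    for d in valid:
        index = int((d - lowest) // bin_width)
        bins.setdefault(index, []).append(d)

    # The peak is taken here as the single bin holding the most pixels.
    peak = max(bins.values(), key=len)

    # Averaging only the peak bin ignores background and noisy pixels.
    return sum(peak) / len(peak)


# Pixels on the pedestrian cluster near 10 m; the rest are background or noise.
mask_depths = [10.0, 10.1, 10.2, 10.3, 25.0, 26.0, 3.0]
distance = pedestrian_distance(mask_depths)
```

Restricting the average to the histogram peak makes the estimate less sensitive to mask pixels that actually belong to the background or to depth estimation errors.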
Highlights
In order to improve road safety, both the scientific community and manufacturers must pay more attention to the development of automobile safety technology
In order to reduce the influence of the noise and error of depth map estimation, we present different methods to measure this distance according to different objects
KITTI is a popular dataset which can be used for vision algorithm testing of Advanced Driving Assistance Systems (ADASs); it contains a large number of stereo image pairs captured from a car driving in an urban scenario and provides sparse depth data matched with the stereo vision
Summary
In order to improve road safety, both the scientific community and manufacturers must pay more attention to the development of automobile safety technology. On the basis of the depth map of the input image, and focusing on the two main participants in road traffic, i.e., pedestrians and vehicles, this paper further proposes a pedestrian–vehicle and a vehicle–vehicle distance measurement method. Although the proposed system needs stereo image pairs and sparse depth information to train a semi-supervised network, it is considered a monocular vision-based approach because it only needs a single input image when used online in V-DAS. The experimental results on the public dataset KITTI and in real road scenarios illustrate that the proposed system can use a single forward-looking vehicle image to obtain the corresponding pixel-level depth information and accurately predict the distances to different targets to meet the needs of ADAS. The conclusions and future work are presented and discussed in the final section.
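The vehicle–vehicle measurement outlined in the abstract (RANSAC plane fit, projection onto the plane, minimum depth) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names, the inlier threshold, the iteration count, and the fixed random seed are hypothetical. The target points are assumed to be already expressed as (X, Y, Z) in the camera coordinate system, with Z as depth. How the minimum depth is converted into the final reported distance is not specified here, so the sketch simply returns the minimum depth.

```python
import random


def fit_plane_ransac(points, threshold=0.05, iterations=200, seed=0):
    """Return (normal, d) of the plane n.p + d = 0 supported by most points."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    best, best_count = None, -1
    for _ in range(iterations):
        # Hypothesise a plane from three randomly chosen points.
        p1, p2, p3 = rng.sample(points, 3)
        u = [b - a for a, b in zip(p1, p2)]
        v = [b - a for a, b in zip(p1, p3)]
        n = [u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0]]
        norm = sum(c * c for c in n) ** 0.5
        if norm < 1e-9:
            continue  # the three points are (nearly) collinear
        n = [c / norm for c in n]
        d = -sum(nc * pc for nc, pc in zip(n, p1))
        # Count the points lying within `threshold` of the candidate plane.
        count = sum(
            1 for p in points
            if abs(sum(nc * pc for nc, pc in zip(n, p)) + d) < threshold
        )
        if count > best_count:
            best, best_count = (n, d), count
    if best is None:
        raise ValueError("could not fit a plane")
    return best


def vehicle_distance(points, **ransac_options):
    """Minimum depth (Z) of the target points after projection onto the plane."""
    n, d = fit_plane_ransac(points, **ransac_options)
    projected = []
    for p in points:
        signed = sum(nc * pc for nc, pc in zip(n, p)) + d  # signed distance
        projected.append([pc - signed * nc for pc, nc in zip(p, n)])
    return min(q[2] for q in projected)


# Synthetic target: a flat vehicle rear at Z = 12 m plus two noisy points.
rear = [[float(x), float(y), 12.0] for x in range(5) for y in range(3)]
noisy = [[0.5, 0.5, 9.0], [1.0, 1.0, 15.0]]
nearest_depth = vehicle_distance(rear + noisy)
```

Projecting every point onto the fitted plane before taking the minimum keeps an isolated, badly estimated pixel (such as the point at 9 m above) from dominating the result.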