Abstract
Lane detection plays an important role in autonomous driving. For video instance lane detection, both global spatial and temporal information are important, yet global spatial features and temporal features have not been well exploited in recent studies. In this work, we address the video instance lane detection task by capturing global context with a non-local attention network. Specifically, we design a twin non-local attention network that extracts long-range dependencies along the spatial and temporal dimensions, respectively. The resulting global spatial and temporal features are then adaptively fused through a gating mechanism for better results. In addition, we propose a frame-similarity loss to further exploit the information in adjacent frames. Experimental results on the video instance lane detection (VIL-100) dataset show that our method outperforms competing methods, and ablation studies further demonstrate the effectiveness of each sub-module.
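To make the architecture described above concrete, the following is a minimal sketch (not the authors' released code) of a non-local attention block and a gated fusion of the two branch outputs. It assumes PyTorch, 4-D feature maps of shape (batch, channels, height, width), and that the temporal branch receives aggregated features from adjacent frames; all layer names and shapes here are illustrative assumptions.

```python
# Hypothetical sketch of a non-local block plus gated spatial/temporal fusion,
# in the spirit of the twin non-local attention design described in the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F


class NonLocalBlock(nn.Module):
    """Non-local (self-attention) block capturing long-range dependencies."""

    def __init__(self, channels: int, reduction: int = 2):
        super().__init__()
        inter = channels // reduction
        self.theta = nn.Conv2d(channels, inter, kernel_size=1)  # query projection
        self.phi = nn.Conv2d(channels, inter, kernel_size=1)    # key projection
        self.g = nn.Conv2d(channels, inter, kernel_size=1)      # value projection
        self.out = nn.Conv2d(inter, channels, kernel_size=1)    # restore channel count

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)  # (b, hw, c')
        k = self.phi(x).flatten(2)                     # (b, c', hw)
        v = self.g(x).flatten(2).transpose(1, 2)       # (b, hw, c')
        attn = F.softmax(q @ k, dim=-1)                # affinities between all positions
        y = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
        return x + self.out(y)                         # residual connection


class GatedFusion(nn.Module):
    """Adaptively fuse spatial and temporal features with a learned gate."""

    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Conv2d(channels * 2, channels, kernel_size=1)

    def forward(self, spatial: torch.Tensor, temporal: torch.Tensor) -> torch.Tensor:
        g = torch.sigmoid(self.gate(torch.cat([spatial, temporal], dim=1)))
        return g * spatial + (1 - g) * temporal


# Usage: run the twin branches on current-frame features and aggregated
# adjacent-frame features, then fuse them adaptively.
frames = torch.randn(1, 64, 40, 100)    # current-frame features (assumed shape)
history = torch.randn(1, 64, 40, 100)   # aggregated adjacent-frame features
spatial_feat = NonLocalBlock(64)(frames)
temporal_feat = NonLocalBlock(64)(history)
fused = GatedFusion(64)(spatial_feat, temporal_feat)
print(fused.shape)  # torch.Size([1, 64, 40, 100])
```

The sigmoid gate lets the network weight the spatial and temporal branches per position rather than summing them with fixed coefficients, which is one plausible reading of the "adaptively fused by gating mechanisms" statement above.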