Abstract

Understanding urban scenes is a fundamental requirement for assisted driving and autonomous vehicles. Most available urban scene understanding methods use red-green-blue (RGB) images; however, their segmentation performance is prone to degradation under adverse lighting conditions. Recently, many effective artificial neural networks have been presented for urban scene understanding and have shown that incorporating RGB and thermal (RGB-T) images can improve segmentation accuracy even under unsatisfactory lighting conditions. However, the potential of multimodal feature fusion has not been fully exploited, because operations such as simply concatenating the RGB and thermal features or averaging their maps are typically adopted. To improve multimodal feature fusion and segmentation accuracy, we propose a multitask-aware network (MTANet) with hierarchical multimodal fusion (a multiscale fusion strategy) for RGB-T urban scene understanding. We developed a hierarchical multimodal fusion module to enhance feature fusion and built a high-level semantic module to extract semantic information for merging with coarse features at various abstraction levels. Using the multilevel fusion module, we exploit low-, mid-, and high-level fusion to improve segmentation accuracy. The multitask module uses boundary, binary, and semantic supervision to optimize the MTANet parameters. Extensive experiments on two benchmark RGB-T datasets verify the improved performance of the proposed MTANet compared with state-of-the-art methods.
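
To make the ideas in the abstract concrete, below is a minimal PyTorch-style sketch of (a) combining the three supervision signals mentioned (boundary, binary, and semantic) into a joint training loss, and (b) an attention-based fusion of RGB and thermal feature maps at one level, rather than plain concatenation or averaging. The module names, loss choices, weights, and tensor shapes are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of MTANet-style multitask supervision and RGB-T fusion.
# Loss choices, weights, and the fusion design below are assumptions made for
# illustration; the paper's actual modules may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F


def multitask_loss(sem_logits, bin_logits, bnd_logits,
                   sem_target, bin_target, bnd_target,
                   weights=(1.0, 0.5, 0.5)):
    """Combine semantic, binary (foreground/background), and boundary losses.

    sem_logits: (B, C, H, W) class scores; sem_target: (B, H, W) class indices.
    bin_logits, bnd_logits: (B, 1, H, W) logits; targets: (B, 1, H, W) in {0, 1}.
    """
    w_sem, w_bin, w_bnd = weights
    loss_sem = F.cross_entropy(sem_logits, sem_target)                      # semantic supervision
    loss_bin = F.binary_cross_entropy_with_logits(bin_logits, bin_target)   # binary mask supervision
    loss_bnd = F.binary_cross_entropy_with_logits(bnd_logits, bnd_target)   # boundary supervision
    return w_sem * loss_sem + w_bin * loss_bin + w_bnd * loss_bnd


class FusionBlock(nn.Module):
    """Merge RGB and thermal features at one level via channel attention
    (an assumed stand-in for the hierarchical multimodal fusion module)."""

    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.merge = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, rgb_feat, thermal_feat):
        x = torch.cat([rgb_feat, thermal_feat], dim=1)
        attn = self.gate(x)               # per-channel modality weighting
        return self.merge(x) * attn       # fused feature map
```

In a hierarchical setup, one such block would be applied at each of the low-, mid-, and high-level encoder stages, and the three loss terms would be summed over the corresponding prediction heads.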
