Abstract

We address RGB road scene material segmentation, i.e., per-pixel segmentation of materials in real-world driving views with pure RGB images, by building a new tailored benchmark dataset and model for it. Our new dataset, KITTI-Materials, based on the well-established KITTI dataset, consists of 1000 frames covering 24 different road scenes of urban/suburban landscapes, with every pixel annotated in high quality with one of 20 material categories. It is the first dataset tailored to RGB material segmentation in realistic driving scenes, which allows us to train and test any RGB material segmentation model. Based on an analysis of KITTI-Materials, we identify the extraction and fusion of texture and context as the key to robust road scene material appearance. We introduce Road scene Material Segmentation Network (RMSNet), a new Transformer-based framework that will serve as a baseline for this challenging task. RMSNet encodes multi-scale hierarchical features with self-attention. We construct the decoder of RMSNet based on a novel lightweight self-attention model, which we refer to as SAMixer. SAMixer achieves adaptive fusion of informative texture and context cues across multiple feature levels. It also significantly accelerates self-attention for feature fusion with a balanced query-key similarity measure. We also introduce a built-in bottleneck of local statistics to achieve further efficiency and accuracy. Extensive experiments on KITTI-Materials validate the effectiveness of RMSNet. We believe our work lays a solid foundation for further studies on RGB road scene material segmentation.
