Abstract

Environmental perception is crucial for unmanned mobile platforms such as autonomous vehicles and robots, and precise, fast semantic segmentation of the surrounding scene is a key task for enhancing this capability. Existing real-time semantic segmentation networks are typically based on convolutional neural networks (CNNs); they have achieved good results but still struggle to capture global context features. In recent years, the Transformer architecture has achieved significant success in capturing global context, which benefits segmentation accuracy. However, Transformers tend to neglect local connections, and their computational complexity makes real-time segmentation challenging. We propose DTMC-Net, a lightweight real-time semantic segmentation network that combines the advantages of CNNs and Transformers. We design a residual convolution module, the Lightweight Multi-layer Separable Convolution Attention (LMSCA) module, which reduces the parameter count and performs multi-scale feature fusion to capture local features effectively. We introduce the Simple Dual-Resolution Transformer (SDR Transformer), which uses a lightweight attention mechanism and residual feed-forward networks to capture and preserve features, with multiple bilateral fusions between its two branches to exchange information. The proposed Anti-artifact Aggregation Pyramid Pooling Module (AAPPM) optimizes the upsampling process, refines features, and performs multi-scale feature fusion once more. DTMC-Net contains only 4.2M parameters and achieves good performance on multiple public datasets covering different scenarios.
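To make the CNN side of this design concrete, the following is a minimal PyTorch sketch of a residual block built from multi-scale depthwise-separable convolutions followed by simple channel attention, in the spirit of the LMSCA module described above. It is an illustrative approximation only: the class name, kernel sizes, reduction ratio, and attention design are our assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn


class SeparableConvAttention(nn.Module):
    """Hypothetical residual block in the spirit of LMSCA: multi-scale
    depthwise-separable convolutions fused by a pointwise convolution,
    reweighted by squeeze-and-excitation style channel attention.
    Details are assumptions, not the paper's actual module."""

    def __init__(self, channels: int):
        super().__init__()
        # Depthwise convolutions at two kernel sizes capture local
        # context at multiple scales with few parameters.
        self.dw3 = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)
        self.dw5 = nn.Conv2d(channels, channels, 5, padding=2, groups=channels)
        # Pointwise convolution fuses the concatenated multi-scale features.
        self.pw = nn.Conv2d(2 * channels, channels, 1)
        self.norm = nn.BatchNorm2d(channels)
        self.act = nn.ReLU(inplace=True)
        # Lightweight channel attention (assumed reduction ratio of 4).
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 4, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Multi-scale local features, fused and normalized.
        y = torch.cat([self.dw3(x), self.dw5(x)], dim=1)
        y = self.act(self.norm(self.pw(y)))
        # Reweight channels, then add the residual connection.
        y = y * self.attn(y)
        return x + y


if __name__ == "__main__":
    block = SeparableConvAttention(64)
    out = block(torch.randn(1, 64, 128, 256))
    print(out.shape)  # torch.Size([1, 64, 128, 256])
```

The separable factorization is what makes such a block lightweight: a standard k-by-k convolution costs on the order of k²·C² parameters per layer, whereas a depthwise convolution plus pointwise fusion costs roughly k²·C + C², which is why modules of this kind suit real-time segmentation budgets.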
