Abstract

The rapid progress of Convolutional Neural Networks (CNNs) and the strong performance of Vision Transformers (ViTs) have delivered impressive results in semantic segmentation. However, each model used in isolation suffers from either high computational complexity or compromised computational efficiency. To address this challenge, we combine the CNN and encoder-decoder structures in a Transformer-inspired fashion, presenting the Serial Semantic Segmentation Trans via CNN Former (SSS-Former) model. To strengthen feature extraction, we use a purpose-built SSS-CSPNet as the backbone of the overall architecture. We propose a novel SSS-PN attention network that enhances the spatial topological connections of features, leading to improved overall performance. Additionally, the integration of SASPP bridges the semantic gap between multi-scale features and improves segmentation of overlapping objects. To meet the requirement of real-time segmentation, we apply a novel restructuring technique to derive a lighter and faster ResSSS-Former model. Extensive experiments demonstrate that both SSS-Former and ResSSS-Former outperform existing state-of-the-art methods in terms of computational efficiency, precision, and speed. In particular, SSS-Former achieves an mIoU of 58.63% at 89.1 FPS on the ADE20K dataset. On the validation and test sets of Cityscapes, it obtains mIoU scores of 85.1% and 85.2% respectively, at a speed of 94.1 FPS. Our optimized ResSSS-Former achieves real-time segmentation at over 100 FPS while maintaining high segmentation accuracy. Results on the ISPRS datasets further validate the effectiveness of the proposed models in segmenting multi-scale and overlapping objects.
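For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how mean Intersection over Union (mIoU) is commonly computed from flattened label maps. It is an illustration only, not the evaluation code behind the reported numbers; the function name `mean_iou`, the `ignore_index` parameter, and the choice to skip classes absent from both prediction and ground truth are assumptions.

```python
def mean_iou(pred, target, num_classes, ignore_index=None):
    """Mean IoU over classes present in the prediction or the ground truth.

    pred and target are flat sequences of integer class labels of equal length.
    Pixels whose ground-truth label equals ignore_index are skipped (assumed
    convention; benchmarks differ in how they treat void labels).
    """
    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        # |A union B| = |A| + |B| - |A intersect B|
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes that never occur
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 0]
    # Per-class IoU: 1/3, 2/3, 1/2, so the mean is 0.5
    print(mean_iou(pred, target, num_classes=3))
```

Benchmark toolkits typically accumulate the per-class intersection and union counts over the whole dataset before dividing, rather than averaging per-image scores, so values from this sketch are comparable to published figures only under that same convention.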
