Abstract

Visual scene understanding largely depends on pixel-wise classification, typically obtained from a deep convolutional neural network. However, existing semantic segmentation models often struggle in real-time applications due to their large network architectures. Although real-time semantic segmentation models are available, their shallow backbones can degrade performance considerably. This paper introduces SDBNetV2, a lightweight semantic segmentation model designed to improve real-time performance without increasing computational cost. A key contribution is a novel Short-term Dense Bottleneck (SDB) module in the encoder, which provides varied fields of view to capture geometrically diverse objects in a complex scene. Additionally, we propose dense feature refinement and improved semantic aggregation modules at the decoder end to enhance contextualization and object localization. We evaluate the proposed model’s performance on several indoor and outdoor datasets in structured and unstructured environments. The results show that SDBNetV2 achieves superior segmentation performance over other real-time models with fewer than 2 million parameters.
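The abstract's notion of "varied fields of view" is commonly realized with dilated (atrous) convolutions, where a fixed kernel is spread over a wider spatial extent. The sketch below is purely illustrative of that general idea, assuming a naive NumPy implementation; it is not the authors' actual SDB module, and the function name `dilated_conv2d` is hypothetical.

```python
import numpy as np

def dilated_conv2d(x, kernel, dilation=1):
    """Naive 'same'-padded 2-D convolution with a dilated kernel.

    Illustrative sketch only (NOT the paper's SDB implementation):
    shows how increasing the dilation rate enlarges the effective
    field-of-view without adding parameters.
    """
    k = kernel.shape[0]
    # Effective receptive field of a k x k kernel at this dilation rate.
    eff = k + (k - 1) * (dilation - 1)
    pad = eff // 2
    xp = np.pad(x, pad)
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            # Sample the input at dilated (strided) positions.
            patch = xp[i:i + eff:dilation, j:j + eff:dilation]
            out[i, j] = (patch * kernel).sum()
    return out, eff

feat = np.random.rand(16, 16)          # toy single-channel feature map
k3 = np.ones((3, 3)) / 9.0             # 3x3 averaging kernel
_, fov1 = dilated_conv2d(feat, k3, dilation=1)   # 3x3 field-of-view
_, fov2 = dilated_conv2d(feat, k3, dilation=2)   # 5x5 field-of-view
_, fov3 = dilated_conv2d(feat, k3, dilation=4)   # 9x9 field-of-view
```

Running the same 3x3 kernel at dilation rates 1, 2, and 4 yields effective fields of view of 3, 5, and 9 pixels respectively, which is how multi-branch encoder blocks can cover objects of different scales at constant parameter cost.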
