Abstract

Real-time semantic segmentation is used in many fields. However, current state-of-the-art methods largely neglect inference speed, while faster models produce coarse segmentation results. To balance inference speed and segmentation accuracy, we propose the Multi-scale Spatial Pyramid Pooling Network (MSPPNet), a lightweight and efficient network for real-time semantic segmentation. We adopt a modified Xception to extract high-level and low-level feature maps, which reduces computational complexity and the number of parameters. In addition, we design a Multi-scale Spatial Pyramid Pooling (MSPP) module to aggregate context information from the high-level feature maps, which improves segmentation accuracy. Furthermore, a spatial attention mechanism is employed to enrich segmentation details and recover object boundaries. Experiments on the Cityscapes dataset show that MSPPNet has fewer than 1M parameters and achieves 64.55% mean IoU at 121 fps, demonstrating a balance between speed and accuracy.
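
For readers unfamiliar with the accuracy metric quoted above, the following is a minimal sketch of how mean IoU (intersection over union, averaged across classes) can be computed from predicted and ground-truth labels. It is not the evaluation code used for MSPPNet. The function name `mean_iou`, the flat label lists, and the `ignore_label=255` default are assumptions made for illustration; benchmark scores are normally accumulated over the whole validation set rather than a single image.

```python
def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean intersection-over-union over two flat sequences of class labels.

    Hypothetical helper for illustration only. Pixels whose ground-truth
    label equals `ignore_label` (an assumed convention) are skipped.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue  # unlabeled pixel: excluded from the score
        if p == t:
            # Correct pixel: counts once in both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of the true class and,
            # if valid, the union of the predicted class.
            union[t] += 1
            if 0 <= p < num_classes:
                union[p] += 1
    # Average only over classes that occur in the prediction or ground truth.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes; the last pixel is ignored.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, num_classes=3)
print(f"mean IoU = {score:.4f}")
```

In the toy example the per-class IoUs are 1/2, 2/3, and 1, so the mean is 13/18, about 0.72.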
