Abstract

Real-time semantic segmentation has attracted wide attention in computer vision, and its performance depends on both rich semantic information and high-resolution spatial information. Most networks rapidly reduce the image resolution or prune feature channels in the encoding stage to increase inference speed, and concatenate shallow features with deep features in the decoding stage to fill in high-resolution details. Here, we propose a new real-time semantic image segmentation method called the guided down-sampling network. Guided down-sampling decomposes the original image into a group of compressed images that replace the original image as the input to the encoding layers. This operation reduces the size of the feature maps while retaining most of the spatial information of the original image. Furthermore, a two-branch sub-network is designed to extract semantic information and restore high-resolution image details from the compressed images, better supervising feature learning. Our network is evaluated on the Cityscapes dataset on a single Nvidia GeForce GTX 1080Ti GPU and achieves competitive results of 113 FPS and 75.6% mIoU. The code is available at https://github.com/ldrunning/segmentation.
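The abstract does not specify how the decomposition into compressed images is performed; a minimal sketch, assuming it behaves like a space-to-depth (pixel-unshuffle) rearrangement that splits an image into r × r lower-resolution sub-images while keeping every original pixel, could look like this (the function name `guided_downsample` is hypothetical):

```python
import numpy as np

def guided_downsample(img, r=2):
    """Decompose an image into r*r compressed sub-images.

    Each sub-image takes every r-th pixel at a different offset
    (a space-to-depth / pixel-unshuffle rearrangement), so the set
    of sub-images together retains every pixel of the original
    image while each one has 1/r of the spatial resolution.
    This is an illustrative assumption, not the paper's exact operator.
    """
    h, w = img.shape[:2]
    assert h % r == 0 and w % r == 0, "image size must be divisible by r"
    return [img[i::r, j::r] for i in range(r) for j in range(r)]

# A 4x4 toy "image": decomposition into 4 sub-images of size 2x2.
img = np.arange(16).reshape(4, 4)
subs = guided_downsample(img, r=2)
assert len(subs) == 4
assert all(s.shape == (2, 2) for s in subs)
# No spatial information is lost: the sub-images partition the pixels.
assert sorted(np.concatenate([s.ravel() for s in subs]).tolist()) == list(range(16))
```

Feeding these sub-images to the encoder in place of the full image shrinks feature-map size by a factor of r² per spatial dimension, which is where the speed-up described in the abstract would come from.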
