Abstract

To achieve high-resolution segmentation results, typical semantic segmentation models often require high-resolution inputs. However, high-resolution inputs inevitably incur a high computational cost, which seriously limits their application in realistic scenarios. To address this problem, we propose to predict a high-resolution semantic segmentation result from a degraded low-resolution input image, a task we call super-resolution semantic segmentation in this paper. We further propose a Relation Calibrating Network (RCNet) for this task. Specifically, we design two modules, namely the Relation Upsampling Module (RUM) and the Feature Calibrating Module (FCM). In RUM, the input feature map generates a pixel-relation map at low resolution, which is then gradually upsampled to high resolution. Meanwhile, FCM takes the input feature map and the relation map from RUM as inputs and gradually calibrates the feature. Finally, the last FCM outputs the high-resolution segmentation result. We conduct extensive experiments to verify the effectiveness of our method. Notably, based on FCN on the Cityscapes dataset, we achieve a comparable segmentation result (from 70.01% to 70.90%) at only about a quarter of the computational cost (reduced from 1107.57 to 255.72 GFLOPs).
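The abstract only sketches the data flow (a low-resolution relation map that is gradually upsampled, then used to calibrate features). The paper's layer-level design is not given here, so the following is a minimal PyTorch sketch of how one RUM/FCM stage might be wired: the module names follow the abstract, but every layer, channel count, and activation is an assumption for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RelationUpsamplingModule(nn.Module):
    """Hypothetical RUM sketch: derive a pixel-relation map from the
    low-resolution feature map, then upsample it (here, by bilinear
    interpolation; the paper's exact operator is not stated)."""

    def __init__(self, in_channels, relation_channels=64, scale=2):
        super().__init__()
        self.relation = nn.Conv2d(in_channels, relation_channels, kernel_size=1)
        self.scale = scale

    def forward(self, feat):
        rel = torch.sigmoid(self.relation(feat))  # low-resolution relation map
        return F.interpolate(rel, scale_factor=self.scale,
                             mode='bilinear', align_corners=False)


class FeatureCalibratingModule(nn.Module):
    """Hypothetical FCM sketch: upsample the input feature map to the
    relation map's resolution, then fuse the two to calibrate the feature."""

    def __init__(self, in_channels, relation_channels=64):
        super().__init__()
        self.fuse = nn.Conv2d(in_channels + relation_channels, in_channels,
                              kernel_size=3, padding=1)

    def forward(self, feat, rel):
        feat = F.interpolate(feat, size=rel.shape[-2:],
                             mode='bilinear', align_corners=False)
        return torch.relu(self.fuse(torch.cat([feat, rel], dim=1)))


# Usage of one stage; RCNet would chain several such stages, with the
# last FCM producing the high-resolution segmentation logits.
feat = torch.randn(1, 256, 64, 128)   # low-resolution backbone feature
rum = RelationUpsamplingModule(256)
fcm = FeatureCalibratingModule(256)
rel = rum(feat)                       # 2x-upsampled relation map
out = fcm(feat, rel)                  # calibrated feature at 2x resolution
```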
