Abstract
Semantic image segmentation has recently witnessed considerable progress by training deep convolutional neural networks (CNNs). The core issue of this technique is the limited capacity of CNNs to depict visual objects. Existing approaches tend to utilize approximate inference in a discrete domain or additional aides and do not have a global optimum guarantee. We propose the use of the multi-label manifold ranking (MR) method in solving the linear objective energy function in a continuous domain to delineate visual objects and solve these problems. We present a novel embedded single stream optimization method based on the MR model to avoid approximations without sacrificing expressive power. In addition, we propose a novel network, which we refer to as dual multi-scale manifold ranking (DMSMR) network, that combines the dilated, multi-scale strategies with the single stream MR optimization method in the deep learning architecture to further improve the performance. Experiments on high resolution images, including close-range and remote sensing datasets, demonstrate that the proposed approach can achieve competitive accuracy without additional aides in an end-to-end manner.
Highlights
Semantic image segmentation, which aims to classify each pixel into one of the given categories, is an important task for understanding [1,2,3] and inferring objects [4,5,6] and their observed relations in a scene
The results demonstrate the superiority of the combination of multi-scale (MS), broader receptive field (Dilated), and manifold ranking optimization (MR-Opti) strategies, which can more accurately classify each pixel with varying spatial resolutions
We present a dual multi-scale manifold ranking (DMSMR) network for semantic image segmentation in a continuous domain
Summary
Semantic image segmentation, which aims to classify each pixel into one of the given categories, is an important task for understanding [1,2,3] and inferring objects [4,5,6] and their observed relations in a scene. Over the last five years, remarkable success in the semantic scene labeling area has been gained through the usage of convolutional neural networks (CNNs) [20,21,22,23,24,25,26] in dense prediction. Coarse pixel-wise labeling is obtained by multi-scale and dilation strategies, whereas the fine segmentation is conducted by optionally integrating contextual information into the output map. Active research has been conducted on these aspects, semantic image segmentation remains a challenging issue because of the complexity. The EvLab-SS dataset poses more challenge to researchers.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.