Abstract

In this paper, we propose a semantic segmentation method based on superpixel region merging and a convolutional neural network (CNN), referred to as the regional merging neural network (RMNN). Image annotation has always played an important role in weakly-supervised semantic segmentation, and most existing methods rely on manual labeling. In this paper, after superpixel segmentation, superpixels with similar features are merged, using the relationships between neighboring pixels, into a number of superpixel blocks. Rough predictions are generated by a fully convolutional network (FCN) so that some of these superpixel blocks are labeled, and further positive regions are then discovered iteratively from the labeled ones. Because superpixels are used, the feature vectors to be extracted are smaller and the data dimensionality is reduced. The algorithm not only uses superpixel merging to narrow down the target region but also compensates for the lack of pixel-level labels in weakly-supervised semantic segmentation. During network training, region merging is used to improve the accuracy of contour recognition. Extensive experiments on the PASCAL VOC 2012 dataset demonstrate the effectiveness of the proposed method; in particular, the mean intersection over union (mIoU) score of our method reaches 44.6%. Because dilated convolution is used in place of the pooled downsampling operation, the network's receptive field is not reduced, thereby ensuring the accuracy of image semantic segmentation. The findings of this work thus open the door to leveraging dilated convolutions to improve the recognition accuracy of small objects.
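As a quick illustration of this last point, the PyTorch snippet below (a minimal sketch, not the paper's architecture) contrasts a pooled 3 × 3 convolution stack with a dilated one: the dilated path enlarges the receptive field by a comparable amount while keeping the full spatial resolution. The input size, channel counts, and dilation rate are arbitrary choices for the example.

```python
# Minimal sketch (not the paper's network) of how dilated convolution avoids
# the resolution loss of pooled downsampling while still enlarging the
# receptive field.  All sizes and channel counts are arbitrary.
import torch
import torch.nn as nn

x = torch.randn(1, 3, 64, 64)

# Pooled path: the receptive field grows, but the feature map shrinks to 32x32.
pooled_path = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 16, kernel_size=3, padding=1),
)

# Dilated path: dilation=2 spreads the 3x3 kernel over a 5x5 extent, growing
# the receptive field by a comparable amount while keeping 64x64 resolution.
dilated_path = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.Conv2d(16, 16, kernel_size=3, dilation=2, padding=2),
)

print(pooled_path(x).shape)   # torch.Size([1, 16, 32, 32])
print(dilated_path(x).shape)  # torch.Size([1, 16, 64, 64])
```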

Highlights

  • In the last few years, convolutional neural networks (CNNs) [1,2,3,4] have found widespread applications in various industries

  • Following the theoretical hypotheses in [25,31], we present an image semantic segmentation method based on superpixel region merging and a CNN

  • This paper proposes a weakly-supervised semantic segmentation method using superpixel aggregation as an annotation


Summary

Introduction

In the last few years, convolutional neural networks (CNNs) [1,2,3,4] have found widespread applications in various industries. Yan et al. [22] proposed a model that assigns labels to superpixels by learning related features, which are then used to merge superpixel blocks and extract candidate regions. Liu et al. [23] proposed a weakly-supervised method based on graph propagation, which automatically propagates image-level labels to superpixels using their context information. The proposed approach works as follows. First, the original image is over-segmented with simple linear iterative clustering (SLIC) [32,33], and three criteria (the full-lambda criterion, the spectral histogram, and a color-texture model) are used for superpixel region merging; a simplified sketch of this stage follows the list below. The SLIC clustering step proceeds as follows:

1. Initialize the cluster centers by sampling pixels on a regular grid.
2. Move each clustering point to the position with the smallest gradient within its 3 × 3 neighborhood.
3. Set the label l(i) = −1 and the distance d(i) = ∞ for each pixel i.
4. Iterate over the points in the collection, assigning each pixel to its nearest cluster center.
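The sketch below illustrates this first stage with scikit-image: SLIC over-segmentation followed by merging of adjacent superpixels with similar mean color via a region adjacency graph. The mean-color threshold is a simplified stand-in for the three merging criteria named above, the test image and parameter values are arbitrary, and a recent scikit-image release is assumed (older versions expose the graph helpers under skimage.future.graph).

```python
# Sketch of the superpixel segmentation + region-merging stage.
# Mean-color merging is a simplified stand-in for the full-lambda /
# spectral-histogram / color-texture criteria used in the paper.
import numpy as np
from skimage import data, segmentation, graph

image = data.astronaut()                              # any RGB test image

# SLIC over-segmentation into roughly 300 superpixels.
superpixels = segmentation.slic(image, n_segments=300, compactness=10)

# Build a region adjacency graph weighted by mean-color distance and merge
# neighboring superpixels whose mean colors differ by less than an
# (arbitrary) threshold, producing larger superpixel blocks.
rag = graph.rag_mean_color(image, superpixels)
blocks = graph.cut_threshold(superpixels, rag, 29)

print(len(np.unique(superpixels)), "superpixels ->",
      len(np.unique(blocks)), "merged blocks")
```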
Algorithm Framework
Experiment Analysis
Qualitative Analysis
Findings
Conclusions and Future Work