Abstract

Image semantic segmentation, a fundamental computer vision task, performs pixel-wise classification of an image, grouping pixels that share semantic content. One of the main issues in semantic segmentation is the creation of fully annotated datasets in which each image has one label per pixel. These annotations are highly time-consuming and, as the amount of labelling grows, so does the percentage of human-entered errors. Segmentation methods based on less supervision can reduce both labelling time and label noise. However, in real-world applications it is far from trivial to establish a method that minimizes labelling time while maximizing performance. Our main contribution is the first comprehensive study of state-of-the-art methods based on different levels of supervision: image-processing baselines and unsupervised, weakly supervised and fully supervised approaches are evaluated. We aim to guide anyone approaching a new real-world use case by providing a trade-off between performance and supervision complexity on datasets from different domains: street scenes (CamVid), microscopy (MetalDAM), satellite imagery (FloodNet) and medical images (NuCLS). Our experimental results suggest that: (i) unsupervised and weak learning perform well on majority classes, which helps to speed up labelling; (ii) weakly supervised methods can outperform fully supervised ones on minority classes; (iii) not all weak learning methods are robust to the nature of the dataset, especially those based on image-level annotations; and (iv) among all weakly supervised methods, point-based ones perform best, even competing with fully supervised methods. The code is available at https://github.com/martafdezmAM/lessen_supervision.

