We propose a new image-level weakly supervised segmentation approach for datasets with a single object class of interest. Our approach is based on a regularized loss function inspired by classical Conditional Random Field (CRF) modeling. The loss models properties of generic objects, and we use it to guide a CNN towards segments that are more likely to correspond to the object, thus avoiding the need for pixel-precise annotations. Training a CNN with a regularized loss is difficult for gradient descent, so we develop an annealing algorithm that is crucial for successful training. Furthermore, we develop an approach for setting the hyperparameters of the most important components of our regularized loss. This is far from trivial, since there is no pixel-precise ground truth for guidance. The advantage of our method is that it uses a standard CNN architecture and an easy-to-interpret loss function derived from classical CRF models. Moreover, we apply the same loss function to any task or dataset. We first evaluate our approach on salient object segmentation and co-segmentation, tasks that naturally involve one object class of interest. We then adapt our approach to image-level weakly supervised multi-class semantic segmentation. We obtain state-of-the-art results.