The results of many existing co-saliency detection methods are easily corrupted by unrelated salient objects whose appearance is similar to that of the co-salient objects. Mining inter-saliency cues that encode the common category information shared by multiple related images is therefore the core of co-saliency detection. To address this concern, a novel group weakly supervised learning induced co-saliency detection (GWSCoSal) model is proposed in this paper. First, a group class activation map (GCAM) network is constructed and trained through a group weakly supervised learning scheme that adopts the common category of a group of related images as the ground truth. The GCAMs produced by the trained network serve as inter-saliency cues that highlight only the regions covered by objects of the common category. The GCAMs are then integrated into a feature pyramid network (FPN) based backbone, trained with pixel-level labels, to infer the co-saliency maps. The group weakly supervised learning and the pixel-level learning are jointly performed to train the GWSCoSal model end-to-end. Comprehensive comparisons with 13 state-of-the-art methods demonstrate that GWSCoSal detects co-salient objects more accurately in the presence of similar but unrelated salient objects, while its overall performance matches that of the state-of-the-art methods. An ablation study further validates the effectiveness of the proposed GCAM network.
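To illustrate the idea summarized above, the following is a minimal, hypothetical sketch of how group-level class activation maps could be produced under group weak supervision and fused with an FPN feature map to predict co-saliency. All module names, the group-level pooling, and the concatenation-based fusion are illustrative assumptions, not the authors' exact design.

```python
# Hypothetical sketch: group class activation maps (GCAM) fused with an FPN
# feature for co-saliency prediction. Not the paper's actual implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GroupCAM(nn.Module):
    """Produces one class activation map per image, supervised only by the
    common category label of the whole image group (weak supervision)."""

    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.classifier = nn.Conv2d(in_channels, num_classes, kernel_size=1)

    def forward(self, feats: torch.Tensor, group_label: torch.Tensor):
        # feats: (N, C, H, W) deep features of N related images in one group.
        # group_label: (num_classes,) one-hot label of the common category.
        class_maps = self.classifier(feats)                # (N, K, H, W)
        group_logits = class_maps.mean(dim=(0, 2, 3))      # pool over group -> (K,)
        # Group weakly supervised classification loss on the shared category.
        cls_loss = F.binary_cross_entropy_with_logits(group_logits, group_label)
        # CAM of the common category, normalized to [0, 1] per image.
        idx = int(group_label.argmax())
        cam = torch.relu(class_maps[:, idx:idx + 1])       # (N, 1, H, W)
        cam = cam / (cam.amax(dim=(2, 3), keepdim=True) + 1e-6)
        return cam, cls_loss


class CoSalHead(nn.Module):
    """Fuses the GCAM cue with one FPN feature level to infer co-saliency."""

    def __init__(self, fpn_channels: int):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(fpn_channels + 1, fpn_channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(fpn_channels, 1, 1),
        )

    def forward(self, fpn_feat: torch.Tensor, cam: torch.Tensor):
        cam = F.interpolate(cam, size=fpn_feat.shape[-2:], mode="bilinear",
                            align_corners=False)
        return torch.sigmoid(self.fuse(torch.cat([fpn_feat, cam], dim=1)))


if __name__ == "__main__":
    feats = torch.randn(5, 256, 32, 32)        # features of 5 related images
    fpn_feat = torch.randn(5, 256, 64, 64)     # one FPN level for the same images
    group_label = F.one_hot(torch.tensor(3), 80).float()
    gcam, head = GroupCAM(256, 80), CoSalHead(256)
    cam, cls_loss = gcam(feats, group_label)
    co_sal = head(fpn_feat, cam)               # (5, 1, 64, 64) co-saliency maps
    print(co_sal.shape, float(cls_loss))
```

In this sketch the weakly supervised classification loss (`cls_loss`) and a pixel-level loss on `co_sal` would be summed for joint end-to-end training, mirroring the two supervision signals described in the abstract.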