Unsupervised salient object detection is an important task in many real-world scenarios where pixel-wise label information is of scarce availability. Despite its significance, this problem remains rarely explored, with a few works that consider unsupervised salient object detection methods based on the fused graph from the sum fusion of multiple deep feature similarity matrices. However, these methods ignore the interrelation of the low-level feature similarity matrices and the high-level semantic similarity matrice, which degrades the quality of the fused graph. In this article, we propose a semantic-consistency-guided multi-graph fusion learning algorithm for unsupervised saliency detection, where the consistency and inconsistency between multiple low-level feature similarity matrices and the high-level semantic similarity matrice are explored to promote the robustness and quality of the fused graph. In the first stage, a semantic-consistency-guided multi-graph fusion learning method is proposed to exploit consistency and inconsistency of multiple low-level deep features and the high-level semantic feature. The semantic-consistency-guided similarity matrices are computed for preliminary saliency ranking. In the following saliency refinement stage, the semantic-enhanced similarity matrices are built by the cross diffusion to fuse the multiple low-level deep features and the high semantic deep feature. Based on the semantic-enhanced similarity matrices, the refinement saliency maps are calculated in a semantic-enhanced cellular automata manner. Furthermore, the final ensemble stage of the large margin semi-supervised classification views the preliminary ranking results and refinement results as features, adopts the large margin graphs for saliency ensemble. Extensive evaluations over four benchmark datasets show that the proposed unsupervised method performs favorably against the state-of-the-art approaches and is competitive with some supervised deep learning-based methods.
Read full abstract