SalGCN

Haoran Lv, Wenrui Dai, Junni Zou, Hongkai Xiong, Chenglin Li, Qin Yang

doi:10.1145/3394171.3413733

Abstract

The non-Euclidean geometry of 360-degree images poses a challenge for saliency prediction. Because spherical data cannot be projected onto a single plane without distortion, existing saliency prediction methods based on traditional CNNs are inefficient. In this paper, we propose SalGCN, a saliency prediction framework for 360-degree images based on graph convolutional networks that operates directly on spherical graph signals. Specifically, we adopt Geodesic ICOsahedral Pixelation (GICOPix) to construct a spherical graph signal from a spherical image in equirectangular projection (ERP) format. We then propose a graph saliency prediction network that directly extracts spherical features and generates a spherical graph saliency map, for which we design an unpooling method, based on linear interpolation, that is suited to spherical graph signals. Training is formulated as a node regression problem between the input and output spherical graph signals, and we further design a Kullback-Leibler (KL) divergence loss with sparse consistency so that the sparseness of the predicted saliency map is closer to that of the ground truth. Finally, to obtain an ERP-format saliency map for evaluation, we propose a spherical crown-based (SCB) interpolation method that converts the output spherical graph saliency map into ERP format. Experiments show that SalGCN achieves comparable or even better saliency prediction performance, both subjectively and objectively, with much lower computational complexity.
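To make the node regression loss more concrete, the following is a minimal sketch of a plain KL divergence between two saliency signals defined on the same set of graph nodes. It is not the loss used in SalGCN: the abstract does not specify the sparse consistency term, so it is omitted here. The function name `kl_divergence`, the `eps` smoothing constant, and the direction of the divergence (ground truth relative to prediction, as is common in saliency evaluation) are all assumptions made for illustration.

```python
import math


def kl_divergence(pred, target, eps=1e-8):
    """Hypothetical helper: KL divergence of `target` from `pred`.

    Both arguments are sequences of non-negative saliency values,
    one value per graph node, in the same node order.
    """
    # Normalize each signal so that it sums to (approximately) 1 over the nodes.
    pred_sum = sum(pred) + eps
    target_sum = sum(target) + eps

    total = 0.0
    for p_raw, q_raw in zip(pred, target):
        p = p_raw / pred_sum
        q = q_raw / target_sum
        # eps keeps the logarithm finite when p or q is zero (assumed smoothing).
        total += q * math.log(eps + q / (p + eps))
    return total


if __name__ == "__main__":
    ground_truth = [0.0, 0.1, 0.7, 0.2]
    close_guess = [0.05, 0.1, 0.65, 0.2]
    poor_guess = [0.7, 0.2, 0.05, 0.05]
    print(kl_divergence(close_guess, ground_truth))
    print(kl_divergence(poor_guess, ground_truth))
```

A prediction that concentrates its mass on the same nodes as the ground truth yields a value near zero, while one that places its mass elsewhere yields a larger value. A sparsity-aware term such as the one the abstract describes would be added on top of this base quantity.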

Full Text