Abstract

Visual attention prediction for the diagnosis of Autism Spectrum Disorder (ASD), a mental disorder, has attracted the interest of a growing number of researchers. Although multiple visual attention prediction models have been proposed, the problem remains open. In this paper, motivated by the shift of visual attention, we propose viewing an image as a pseudo sequence, and we present a novel visual attention prediction method for ASD with hierarchical semantic fusion (ASD-HSF). Specifically, the proposed model mainly consists of a Spatial Feature Module (SFM) and a Pseudo Sequential Feature Module (PSFM). The SFM extracts spatial semantic features with a fully convolutional network, while the PSFM, implemented with two Convolutional Long Short-Term Memory networks (ConvLSTMs), learns pseudo sequential features. The outputs of the two modules are fused to produce the final saliency map, which captures both spatial semantic information and pseudo sequential information. Experimental results show that the proposed model not only outperforms ten state-of-the-art general saliency prediction models, but also, among ASD-specific saliency prediction methods, ranks first under four metrics and second under the remaining ones.
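
To make the two-branch design concrete, below is a minimal sketch in Python, assuming a PyTorch-style implementation with a small fully convolutional backbone for the SFM, two stacked single-layer ConvLSTM cells for the PSFM, repetition of the spatial feature map as the pseudo sequence, and channel-wise concatenation as the fusion step. The module names, channel sizes, number of steps, and fusion operator are illustrative assumptions; the abstract does not specify these details.

# Minimal, hypothetical sketch of the two-branch design described in the
# abstract. ConvLSTMCell, the SFM/PSFM layer counts, channel sizes, the way
# the pseudo sequence is formed, and the fusion operator are all assumptions.
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """One convolutional LSTM cell; all four gates come from a single conv."""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

class ASDHSFSketch(nn.Module):
    def __init__(self, feat_ch=64):
        super().__init__()
        # SFM: a small fully convolutional branch for spatial semantic features.
        self.sfm = nn.Sequential(
            nn.Conv2d(3, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
        )
        # PSFM: two stacked ConvLSTM cells over the pseudo sequence.
        self.convlstm1 = ConvLSTMCell(feat_ch, feat_ch)
        self.convlstm2 = ConvLSTMCell(feat_ch, feat_ch)
        # Fusion head: concatenate both branches, predict one saliency channel.
        self.head = nn.Conv2d(2 * feat_ch, 1, kernel_size=1)

    def forward(self, img, steps=4):
        spatial = self.sfm(img)                            # B x C x H x W
        b, ch, hgt, wid = spatial.shape
        h1 = c1 = torch.zeros(b, ch, hgt, wid, device=img.device)
        h2 = c2 = torch.zeros(b, ch, hgt, wid, device=img.device)
        # Pseudo sequence: the same feature map is fed for several steps so
        # the ConvLSTMs can model the shift of attention over "time".
        for _ in range(steps):
            h1, c1 = self.convlstm1(spatial, (h1, c1))
            h2, c2 = self.convlstm2(h1, (h2, c2))
        fused = torch.cat([spatial, h2], dim=1)            # hierarchical fusion
        return torch.sigmoid(self.head(fused))             # B x 1 x H x W

For example, ASDHSFSketch()(torch.rand(1, 3, 128, 128)) returns a 1 x 1 x 128 x 128 saliency map; a real implementation would replace the toy backbone with a pretrained fully convolutional network and train against recorded ASD fixation maps.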
