Abstract
Based on transfer learning, feature maps of deep convolutional neural networks (DCNNs) have been used to predict human visual attention. In this paper, we conduct extensive comparisons to investigate effects of feature maps on the predictions of the human visual attention using a deep features based saliency model framework. The feature maps of seven pretrained DCNNs are investigated using classical and class activation maps approaches. The performances of various saliency implementations are evaluated over four datasets using three metrics. The results demonstrate that deep feature maps of the pretrained DCNNs can be used to create saliency maps for the prediction of human visual attention. The incorporation of multiple levels of blurred and multi-scale feature maps improves the extraction of salient regions. Moreover, DCNNs pretrained using the Places dataset provide more localized objects that can be beneficial to the top-down saliency maps.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
More From: Journal of Visual Communication and Image Representation
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.