Abstract

Computational image memorability prediction has made significant progress in recent years. It has been reported that the memorability of images spanning many different object and scene classes can be estimated robustly. However, large-scale data-driven methods, including deep Convolutional Neural Networks (CNNs), leave room for improvement when applied to smaller benchmark datasets. In this work, we investigate the missing link behind this performance gap through in-depth qualitative analysis, and then provide suggestions to bridge the gap. Specifically, we study the relationship between image memorability and the spatial composition of objects within the scene depicted by an image. Our hypothesis is that image memorability is closely related to the composition of the scene, beyond the mere location and presence of objects. Experimental results show that recent advances in scene parsing methods, which extract contextual information from an image, not only help us better understand the relationship between image memorability and object composition, but also show promising potential for improving computational memorability prediction.
