Abstract

Existing saliency prediction methods focus on universal saliency models for natural images; relatively few address advertising images, which typically consist of both textual and pictorial regions. To fill this gap, we first build an advertising image database, named ADD1000, which records the eye movement data of 57 subjects on 1000 ad images. Compared to natural images, advertising images contain more artificial scenarios and show stronger persuasiveness and deliberateness, yet the impact of this scene heterogeneity on visual attention is rarely studied. Moreover, text elements and picture elements in ad images express closely related semantic information to highlight a product or brand, yet their respective contributions to visual attention are also little known. Motivated by these observations, we further propose a saliency prediction model for advertising images based on text enhanced learning (TEL-SP), which comprehensively considers the interplay between textual and pictorial regions. Extensive experiments on the ADD1000 database show that the proposed model outperforms existing state-of-the-art methods.
