Abstract

Understanding human visual attention is important for multimedia applications. Many studies have attempted to build saliency prediction models for natural images. However, limited effort has been devoted to saliency prediction for Web pages, which are characterized by diverse content elements and spatial layouts. In this paper, we propose a novel end-to-end deep generative saliency model for Web pages. To capture the position biases introduced by page layouts, we propose a Position Prior Learning (PPL) sub-network, which models these biases with a variational auto-encoder. To model the different elements of a Web page, we introduce a Multi Discriminative Region Detection (MDRD) branch and a Text Region Detection (TRD) branch, which extract discriminative localizations and prominent text regions, respectively. We validate the proposed model on the public Web-page dataset 'FIWI' and show that it outperforms state-of-the-art models for Web-page saliency prediction.

