Abstract

Saliency prediction is valuable in many video applications, such as intelligent retrieval, advertisement design and delivery, video coding, and video summarization. Although image saliency has been well explored, less work has been done on videos. Compared with images, video saliency is more strongly driven by semantics. In this paper, we propose a method to predict video saliency by introducing semantic information. Unlike existing approaches, we simultaneously consider bottom-up and top-down factors in a machine learning framework and use a semantic object learning model to compute the semantics-related saliency map. The proposed method is tested on two datasets. The experimental results show that the proposed method is more consistent with human gaze-tracking data across various video contents. Furthermore, computational efficiency is also improved, since the method does not need to process every pixel of each frame during feature extraction.
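The abstract describes combining bottom-up (stimulus-driven) and top-down (semantic) cues into a single saliency map. As a minimal illustrative sketch only, the snippet below fuses two normalized maps with a linear weight; the weighting scheme and the `alpha` parameter are assumptions for illustration, since the abstract does not specify the paper's actual fusion rule.

```python
import numpy as np

def fuse_saliency(bottom_up, top_down, alpha=0.5):
    """Linearly fuse a bottom-up and a top-down (semantic) saliency map.

    Both maps are 2-D arrays of the same shape; `alpha` weights the
    top-down component. This linear rule is a hypothetical placeholder,
    not the paper's method.
    """
    def norm(m):
        # Rescale a map to [0, 1] so the two components are comparable.
        m = m.astype(np.float64)
        rng = m.max() - m.min()
        return (m - m.min()) / rng if rng > 0 else np.zeros_like(m)

    return alpha * norm(top_down) + (1.0 - alpha) * norm(bottom_up)

# Example with random stand-ins for the two per-frame maps:
bu = np.random.rand(64, 64)   # e.g., contrast/motion-based saliency
td = np.random.rand(64, 64)   # e.g., semantic-object-based saliency
fused = fuse_saliency(bu, td, alpha=0.6)
print(fused.shape, fused.min(), fused.max())
```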
