Abstract

Saliency detection in videos has attracted great attention in recent years due to its wide range of applications, such as object detection and recognition. This paper proposes a novel spatiotemporal saliency detection model. First, the discrete cosine transform coefficients are used as features to generate the spatial saliency maps. Then, a hierarchical structure is used to filter out motion vectors that might belong to the background, and the remaining motion vectors are used to obtain a rough temporal saliency map. Because some outliers remain in the temporal saliency map, macro-block information is used to revise it. Finally, an adaptive fusion method merges the spatial and temporal saliency maps of each frame into its spatiotemporal saliency map. The proposed model has been extensively tested on several video sequences and is shown to outperform various state-of-the-art models, by more than 0.127 in AUC and 0.182 in F-measure on average.
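
The final fusion step can be illustrated with a minimal sketch. The abstract does not specify the adaptive fusion rule, so the weighting below (each map weighted by its share of the total variance after min-max normalization) is an assumption chosen only for illustration, and the function names normalize and fuse_saliency are hypothetical rather than taken from the paper.

```python
from statistics import pvariance


def normalize(saliency_map):
    """Min-max scale a 2D map (list of rows) to [0, 1]; a flat map becomes all zeros."""
    values = [v for row in saliency_map for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        return [[0.0 for _ in row] for row in saliency_map]
    return [[(v - lo) / (hi - lo) for v in row] for row in saliency_map]


def fuse_saliency(spatial, temporal):
    """Blend two same-sized saliency maps with a data-dependent weight.

    Assumed rule (not the paper's): the temporal weight is the temporal
    map's share of the combined variance, so a map with little contrast
    contributes little to the result.
    """
    s = normalize(spatial)
    t = normalize(temporal)
    var_s = pvariance([v for row in s for v in row])
    var_t = pvariance([v for row in t for v in row])
    total = var_s + var_t
    # Fall back to an even split when both maps are flat.
    w_t = 0.5 if total == 0 else var_t / total
    return [
        [(1.0 - w_t) * sv + w_t * tv for sv, tv in zip(s_row, t_row)]
        for s_row, t_row in zip(s, t)
    ]
```

Under this assumed rule, a static scene whose temporal map is flat falls back entirely on the spatial map, while a scene with strong, localized motion is weighted toward the temporal map.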
