Abstract

In this paper we propose a complete framework for the automatic detection and tracking of salient objects in video streams. The video stream is first segmented into shots using a scale-space filtering graph-partition method. For each detected shot, an associated static summary is built with a leap keyframe extraction method. Based on these representative images, we then introduce a combined spatial and temporal video attention model that can recognize both interesting objects and actions in image sequences. The approach extends state-of-the-art region-based contrast saliency with a temporal attention model. The different types of motion present in the current shot are identified using a set of homographic transforms, estimated by recursively applying the RANSAC algorithm to the interest-point correspondences. Finally, a decision is made by combining the information from both saliency maps. The experimental results validate the proposed framework and demonstrate that our approach handles various types of video and is robust to noise and low resolution.
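The recursive RANSAC step described above can be sketched as follows. This is a minimal illustration under our own assumptions, not the authors' implementation: the function names and parameters (`fit_homography`, `thresh`, `min_support`) are hypothetical, and a production version would typically delegate to a library estimator such as OpenCV's `cv2.findHomography`. Each RANSAC pass fits one homography to the interest-point correspondences; its inliers form one motion layer, and the procedure recurses on the remaining outliers until too few points are left.

```python
import numpy as np

def fit_homography(src, dst):
    """Direct Linear Transform: estimate the 3x3 H mapping src -> dst.
    src, dst: (N, 2) point arrays with N >= 4 non-degenerate points."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def transfer_error(H, src, dst):
    """Forward transfer error of each correspondence under H."""
    p = np.hstack([src, np.ones((len(src), 1))]) @ H.T
    p = p[:, :2] / p[:, 2:3]
    return np.linalg.norm(p - dst, axis=1)

def ransac_homography(src, dst, n_iter=500, thresh=3.0, rng=None):
    """One RANSAC pass: best homography and its inlier mask."""
    if rng is None:
        rng = np.random.default_rng(0)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iter):
        idx = rng.choice(len(src), 4, replace=False)
        H = fit_homography(src[idx], dst[idx])
        inliers = transfer_error(H, src, dst) < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on the full inlier set for a more stable estimate.
    return fit_homography(src[best_inliers], dst[best_inliers]), best_inliers

def recursive_motion_layers(src, dst, min_support=8):
    """Recursively run RANSAC on the remaining outliers; each pass
    yields one homography, i.e. one candidate motion layer."""
    layers, remaining = [], np.arange(len(src))
    while len(remaining) >= min_support:
        H, inl = ransac_homography(src[remaining], dst[remaining])
        if inl.sum() < min_support:
            break
        layers.append((H, remaining[inl]))
        remaining = remaining[~inl]
    return layers
```

With two rigid motions in a shot (e.g. camera pan plus one moving object), the first pass recovers the dominant motion and the second pass recovers the secondary one; points supporting neither layer remain as unexplained outliers.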
