Abstract

Accurately labeling salient regions in video with cluttered backgrounds and complex motion remains a challenging task. In this paper, we propose an efficient, low-complexity spatiotemporal consistency optimization model and a video saliency detection framework built on spatiotemporal consistency. We derive superpixel-level spatial and temporal saliency values by integrating three spatial saliency features and two temporal saliency features, respectively. The spatial and temporal saliency maps are first optimized separately using the spatiotemporal consistency optimization model; the two maps are then fused and enhanced by spatiotemporal consistency optimization. Finally, pixel-level salient regions are generated by a graph-cuts algorithm. Experimental results on two challenging benchmark video datasets demonstrate the superiority and robustness of the proposed spatiotemporal consistency optimization model and video saliency detection framework over state-of-the-art methods.
