Abstract

Spatio-temporal saliency detection has attracted considerable research interest owing to its strong performance in a wide range of multimedia applications. Existing bottom-up algorithms for spatio-temporal saliency detection often over-simplify the fusion strategy, which results in performance inferior to that of the human visual system. In this paper, a novel bottom-up spatio-temporal saliency model is proposed to improve the accuracy of attentional region estimation in videos by fully exploiting the benefits of fusion. To represent the space constructed from several types of features extracted from video, such as location, appearance, and temporal cues, kernel regression in mixed feature spaces (KR-MFS), which includes three approximation entity-models, is proposed. Using KR-MFS, a hybrid fusion strategy is presented and embedded into the spatio-temporal saliency model; it combines the spatial and temporal saliency of each individual unit and incorporates the influence of neighboring units. The proposed model has been evaluated on a publicly available dataset. Experimental results show that the proposed spatio-temporal saliency model achieves better performance than state-of-the-art approaches.
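
The hybrid fusion idea can be illustrated with a minimal sketch. This is not the KR-MFS formulation itself, whose three approximation entity-models are not specified in the abstract. It assumes a plain Nadaraya-Watson estimator with a product of Gaussian kernels, one per feature space. The names `mixed_kernel`, `fuse_saliency`, `alpha`, and `bandwidths`, the dictionary layout of a unit, and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mixed_kernel(u, v, bandwidths):
    """Product of Gaussian kernels, one per feature space (an assumed form)."""
    weight = 1.0
    for name, h in bandwidths.items():
        weight *= math.exp(-sq_dist(u[name], v[name]) / (2.0 * h * h))
    return weight


def fuse_saliency(units, bandwidths, alpha=0.5):
    """Hybrid fusion sketch: per-unit combination, then neighbor-aware regression."""
    # Step 1: combine the spatial and temporal saliency of each individual unit.
    combined = [alpha * u["spatial"] + (1.0 - alpha) * u["temporal"] for u in units]

    # Step 2: Nadaraya-Watson regression; units that are close in every
    # feature space (location, appearance, motion) influence each other most.
    fused = []
    for u in units:
        weights = [mixed_kernel(u, v, bandwidths) for v in units]
        total = sum(weights)  # at least 1.0, since the self-weight is exp(0)
        fused.append(sum(w * c for w, c in zip(weights, combined)) / total)
    return fused


# Hypothetical units: two similar neighbors and one distant, dissimilar unit.
units = [
    {"location": (0.0, 0.0), "appearance": (0.9,), "motion": (1.0,),
     "spatial": 0.8, "temporal": 0.6},
    {"location": (1.0, 0.0), "appearance": (0.8,), "motion": (0.9,),
     "spatial": 0.4, "temporal": 0.6},
    {"location": (10.0, 10.0), "appearance": (0.1,), "motion": (0.0,),
     "spatial": 0.1, "temporal": 0.1},
]
bandwidths = {"location": 1.5, "appearance": 0.2, "motion": 0.5}
fused = fuse_saliency(units, bandwidths)
```

In this sketch the two neighboring units pull each other's scores together, while the distant unit keeps essentially its own combined score because its kernel weights toward the others are negligible. The bandwidths control how strongly each feature space limits that neighbor influence.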
