Abstract

Saliency mapping is an efficient means of processing large amounts of incoming visual information from images and videos. Existing methods based on feature integration and reconstruction error provide only a rough saliency map and are sensitive to real-world noise. Furthermore, for first-person-vision video of human activities, existing methods identify objects in the image without considering the actions being performed. To address this issue, we propose a novel supervised saliency mapping method based on a sparse coding framework. Unlike standard sparse representation, which uses dictionary atoms to represent signals, we use the inverse formulation: the original image or video frame signals serve as the dictionary, and these signals are used to represent a salient superpixel matrix learned from a class of training images. We then construct the saliency map by inverse sparse coding. In this paper, we first describe how the salient superpixel matrix is extracted by supervised selection, and then explain how the saliency map is obtained through the inverse sparse coding framework. Experimental results show that the proposed method is both more accurate and more time-efficient than previous methods.
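The inverse sparse coding idea described above can be sketched in a few lines: treat the image's own superpixel features as dictionary atoms, sparsely encode a learned salient-superpixel template against that dictionary, and read each superpixel's saliency from the magnitude of its coefficient. The sketch below is a minimal, self-contained illustration using ISTA for the lasso solve; the feature dimensions, the random dictionary, and the template vector are all hypothetical stand-ins, not the authors' actual pipeline.

```python
import numpy as np

def ista(D, y, lam=0.05, n_iter=300):
    """Solve min_x 0.5*||y - D x||^2 + lam*||x||_1 by iterative
    soft-thresholding (ISTA)."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        g = D.T @ (D @ x - y)              # gradient of the data-fit term
        z = x - g / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return x

rng = np.random.default_rng(0)
n_superpixels, feat_dim = 50, 32

# Hypothetical dictionary: one unit-norm feature column per image superpixel.
D = rng.standard_normal((feat_dim, n_superpixels))
D /= np.linalg.norm(D, axis=0)

# Hypothetical salient-superpixel template (stand-in for the learned
# salient superpixel matrix): here, a mix of superpixels 3 and 7.
y = D[:, [3, 7]] @ np.array([1.0, 0.8])

codes = ista(D, y)
# Saliency score per superpixel: how strongly that superpixel's own
# features participate in representing the salient template.
saliency = np.abs(codes) / (np.abs(codes).max() + 1e-12)
```

Because the dictionary is the image itself, superpixels whose features are needed to reconstruct the learned salient template receive large coefficients, which is the "inverse" of the usual dictionary-atoms-represent-signal direction.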
