Abstract

This paper proposes a new perspective, the Vicept representation, to address visual polysemia and concept polymorphism in large-scale semantic image understanding. Vicept characterizes the membership probability distribution between visual appearances and semantic concepts, and forms a hierarchical representation of image semantics from the local to the global level. In the implementation, visual appearance is encoded as a weighted sum of dictionary elements by incorporating group sparse coding, which yields a more accurate image representation with sparsity at the image level. To obtain discriminative Vicept descriptions with structural sparsity, mixed-norm regularization is adopted in the optimization problem for learning the concept membership distribution of each visual appearance. Furthermore, we introduce a novel image distance measure based on the hierarchical Vicept description, in which the distances at different levels are fused by multi-level separability analysis. Finally, the broad applicability of the Vicept description is validated in our experiments on large-scale semantic image search, image annotation, and semantic image re-ranking.
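The abstract names several concrete steps: sparse coding of local appearance against a dictionary, mapping codes to concept membership distributions, local-to-global pooling, and fusing distances across levels. The sketch below is only a rough illustration of how such a pipeline could be wired together, not the authors' implementation. All function names and parameters are hypothetical; a simple matching-pursuit encoder stands in for the paper's group sparse coding, the atom-to-concept membership matrix is taken as given rather than learned with mixed-norm regularization, and the per-level fusion weights are assumed inputs rather than the output of the paper's separability analysis.

```python
import numpy as np

def encode_appearance(patch_features, dictionary, n_nonzero=10):
    """Sparse-code local patch features (N x D) against a dictionary (K x D).

    Greedy matching pursuit is used here purely as a stand-in for the
    group sparse coding described in the abstract; atoms are assumed unit-norm.
    """
    codes = np.zeros((len(patch_features), dictionary.shape[0]))
    for i, x in enumerate(patch_features):
        residual = x.astype(float).copy()
        for _ in range(n_nonzero):
            scores = dictionary @ residual
            j = int(np.argmax(np.abs(scores)))
            codes[i, j] += scores[j]
            residual = residual - scores[j] * dictionary[j]
    return codes

def vicept_description(codes, membership):
    """Turn sparse appearance codes (N x K) into concept distributions (N x C).

    `membership` (K x C) is a hypothetical atom-to-concept membership matrix;
    each output row is normalized to a probability distribution over concepts.
    """
    raw = np.maximum(codes, 0) @ membership + 1e-12
    return raw / raw.sum(axis=1, keepdims=True)

def hierarchical_vicept(patch_vicepts, positions, levels=(1, 2, 4)):
    """Pool local Vicept vectors into a local-to-global spatial pyramid.

    `positions` holds normalized (x, y) patch coordinates in [0, 1];
    each level averages the Vicept vectors falling into each grid cell.
    """
    pyramid = []
    for g in levels:
        cells = np.zeros((g * g, patch_vicepts.shape[1]))
        counts = np.zeros(g * g)
        xy = np.minimum((positions * g).astype(int), g - 1)
        for cell, v in zip(xy[:, 1] * g + xy[:, 0], patch_vicepts):
            cells[cell] += v
            counts[cell] += 1
        pyramid.append((cells / np.maximum(counts[:, None], 1)).ravel())
    return pyramid

def vicept_distance(pyr_a, pyr_b, level_weights):
    """Fuse per-level L1 distances with given weights (assumed, not learned here)."""
    return sum(w * np.abs(a - b).sum()
               for w, a, b in zip(level_weights, pyr_a, pyr_b))
```

A toy usage with random data would be: build a unit-norm dictionary and a row-stochastic membership matrix, call `encode_appearance` on local features, pass the codes through `vicept_description`, pool with `hierarchical_vicept`, and compare two images with `vicept_distance` under chosen level weights.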
