Abstract

The number of images accompanied by user-provided tags, which offer only weak supervision, has increased dramatically in recent years. User-provided tags are incomplete, subjective, and noisy. In this paper, we focus on the problem of social image understanding, i.e., tag refinement, tag assignment, and image retrieval. Unlike previous work, we propose a novel weakly supervised deep matrix factorization algorithm, which uncovers the latent image representations and tag representations embedded in a latent subspace by collaboratively exploring the weakly supervised tagging information, the visual structure, and the semantic structure. Because of the well-known semantic gap, the hidden representations of images are learned by a hierarchical model that progressively transforms them from the visual feature space. The learned deep architecture can also naturally embed new images into the subspace. The semantic and visual structures are jointly incorporated to learn a semantic subspace without overfitting to the noisy, incomplete, or subjective tags. In addition, to remove noisy or redundant visual features, a sparse model is imposed on the transformation matrix of the first layer in the deep architecture. Finally, the problem is formulated as a unified optimization problem with a well-defined objective function and solved by a gradient descent procedure with curvilinear search. Extensive experiments are conducted on real-world social image databases for three image understanding tasks: image tag refinement, tag assignment, and image retrieval. Compared with state-of-the-art algorithms, the proposed method achieves encouraging results, which demonstrates its effectiveness.
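
The core idea can be illustrated with a minimal NumPy sketch. It is not the paper's formulation: it assumes a two-layer `tanh` transformation, a row-wise l2,1 penalty as the sparse model on the first layer, and a plain squared reconstruction error on the tag matrix, and it omits the visual and semantic structure terms and the curvilinear search entirely. All names (`embed`, `objective`, `lam`, the matrix sizes) are hypothetical and chosen only for illustration.

```python
import numpy as np

def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W (assumed row-sparsity penalty)."""
    return float(np.sqrt((W ** 2).sum(axis=1)).sum())

def embed(X, weights):
    """Pass visual features through the stacked layers to obtain latent image representations."""
    H = X
    for W in weights:
        H = np.tanh(H @ W)  # tanh is an assumed activation, not taken from the paper
    return H

def objective(X, F, weights, V, lam=0.1):
    """Squared error between the tag matrix F and U V^T, plus sparsity on the first layer."""
    U = embed(X, weights)
    recon = np.linalg.norm(F - U @ V.T, "fro") ** 2
    return float(recon + lam * l21_norm(weights[0]))

rng = np.random.default_rng(0)
n, d, h, r, m = 6, 8, 5, 3, 4  # images, visual dims, hidden units, latent dims, tags

X = rng.normal(size=(n, d))                    # visual features (synthetic)
F = (rng.random((n, m)) > 0.7).astype(float)   # noisy binary image-tag matrix (synthetic)
weights = [rng.normal(scale=0.1, size=(d, h)),
           rng.normal(scale=0.1, size=(h, r))]
V = rng.normal(scale=0.1, size=(m, r))         # latent tag representations

U = embed(X, weights)   # latent image representations
scores = U @ V.T        # image-tag relevance scores used for refinement or assignment
loss = objective(X, F, weights, V)
```

Under these assumptions, a new image is embedded by calling `embed` on its feature vector with the learned `weights`, and its tags are ranked by the corresponding row of `scores`.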
