Abstract

Recent years have witnessed the explosive growth of community-contributed images with rich context information, which is beneficial to the task of image retrieval. It can help us to learn a suitable metric to alleviate the semantic gap. In this paper, we propose a new distance metric learning algorithm, namely weakly-supervised deep metric learning (WDML), under the deep learning framework. It utilizes a progressive learning manner to discover knowledge by jointly exploiting the heterogeneous data structures from visual contents and user-provided tags of social images. The semantic structure in the textual space is expected to be well preserved while the problem of the noisy, incomplete or subjective tags is addressed by leveraging the visual structure in the original visual space. Besides, a sparse model with the ${\ell _{2,1}}$ mixed norm is imposed on the transformation matrix of the first layer in the deep architecture to compress the noisy or redundant visual features. The proposed problem is formulated as an optimization problem with a well-defined objective function and a simple yet efficient iterative algorithm is proposed to solve it. Extensive experiments on real-world social image datasets are conducted to verify the effectiveness of the proposed method for image retrieval. Encouraging experimental results are achieved compared with several representative metric learning methods.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.