Abstract

Fine-grained image-text retrieval aims to search for relevant images among fine-grained classes given a text query, or vice versa. The challenges lie not only in bridging the gap between two heterogeneous modalities but also in handling the large inter-class similarity and intra-class variance inherent in fine-grained data. To address these challenges, we propose a Discriminative Latent Space Learning (DLSL) method for fine-grained image-text retrieval. Concretely, image and text features are first extracted to capture the subtle differences in fine-grained data. Subsequently, based on the extracted features, we perform coupled dictionary learning to align the heterogeneous data in a unified latent space. To make this alignment discriminative enough for the fine-grained task, the learned latent space is endowed with discriminative properties via learning a discriminative map. Comprehensive experiments on fine-grained datasets demonstrate the effectiveness of our approach.
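
Since the abstract does not state the concrete objective, the following is only a minimal sketch of what a coupled dictionary learning objective with a discriminative map could look like; the symbols (image features X_v, text features X_t, modality dictionaries D_v and D_t, shared codes A, label matrix Y, discriminative map W, and the weights \lambda and \beta) are illustrative assumptions rather than the paper's notation:

\min_{D_v, D_t, A, W} \; \|X_v - D_v A\|_F^2 + \|X_t - D_t A\|_F^2 + \lambda \|A\|_1 + \beta \|Y - W A\|_F^2 \quad \text{s.t.}\; \|d_i\|_2 \le 1 \;\; \forall i,

where the first two terms couple the two modalities through the shared codes A, the \ell_1 penalty keeps the codes sparse, and the last term pushes codes from the same fine-grained class toward the same label vector under the map W, which is one standard way a latent space can be given a discriminative property.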
