Abstract
Cross-modal retrieval aims to narrow the heterogeneity gap between different modalities, for example retrieving images with text queries or vice versa. One of the key challenges in cross-modal retrieval is the inconsistent distribution across modalities. Most existing methods construct a common representation subspace to overcome this challenge; however, most single-path cross-modal learning approaches do not fully exploit the supervision information. In this paper, we present a novel Parallel Learned generative adversarial network with Multi-path Subspaces (PLMS) for cross-modal retrieval. PLMS is a parallel-learning architecture that aims to capture more effective information in an end-to-end trained cross-modal retrieval model. Specifically, a dual-branch network is constructed in each modality-specific generator, so that the overall framework learns two common subspaces that emphasize different aspects of the supervision information and preserve more effective transformed features. We further design two objective functions for training the dual branches of the generators. Through joint training, the feature representations generated by the dual branches of a specific modality are fused for similarity measurement between modalities. To avoid redundancy and overlap during fusion, a Multi-source Domain Balancing (MDB) mechanism is presented to explore the contribution of the two task-specific branches. Extensive experiments show that our proposed method is effective and achieves state-of-the-art results on four widely used databases.
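To make the described data flow concrete, below is a minimal, hypothetical PyTorch sketch of the idea in the abstract: a modality-specific generator with two parallel branches whose outputs are fused with learnable balancing weights before cross-modal similarity is computed. The names (DualBranchGenerator, balance, common_dim) and the softmax-weighted fusion are illustrative assumptions standing in for the MDB mechanism, not the authors' actual implementation.

```python
import torch
import torch.nn as nn


class DualBranchGenerator(nn.Module):
    """Illustrative modality-specific generator with two parallel branches.

    Each branch projects the modality feature into its own common subspace;
    the outputs are fused with learnable balancing weights (a hypothetical
    stand-in for the MDB mechanism described in the abstract).
    """

    def __init__(self, in_dim: int, common_dim: int):
        super().__init__()
        # Two parallel branches, each mapping the input into a common subspace.
        self.branch_a = nn.Sequential(nn.Linear(in_dim, common_dim), nn.ReLU(),
                                      nn.Linear(common_dim, common_dim))
        self.branch_b = nn.Sequential(nn.Linear(in_dim, common_dim), nn.ReLU(),
                                      nn.Linear(common_dim, common_dim))
        # Learnable fusion weights, normalised with softmax so the two
        # branch contributions stay balanced during fusion.
        self.balance = nn.Parameter(torch.zeros(2))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        za, zb = self.branch_a(x), self.branch_b(x)
        w = torch.softmax(self.balance, dim=0)
        return w[0] * za + w[1] * zb


# Toy usage: image and text generators mapping into the same common space,
# compared with cosine similarity for retrieval (dimensions are illustrative).
img_gen = DualBranchGenerator(in_dim=4096, common_dim=256)
txt_gen = DualBranchGenerator(in_dim=300, common_dim=256)
img_feat, txt_feat = torch.randn(8, 4096), torch.randn(8, 300)
sim = nn.functional.cosine_similarity(img_gen(img_feat), txt_gen(txt_feat))
print(sim.shape)  # torch.Size([8])
```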