Abstract

Because different modalities (e.g., images and texts) have inconsistent distributions and representations, correlating such heterogeneous data is very challenging. A standard solution is to construct a common subspace in which common representations of the different modalities are generated to bridge the heterogeneity gap. Existing methods based on common representation learning mostly adopt a less effective two-stage paradigm: they first generate separate representations for each modality, exploiting the modality-specific properties as complementary information, and then capture the cross-modal correlation in those separate representations for common representation learning. Moreover, these methods usually neglect that the modality-specific properties may contain interference, such as unrelated objects and background regions in images or noisy words and incorrect sentences in text. In this article, we hypothesize that explicitly modeling the interference within each modality can improve the quality of common representation learning. To this end, we propose a novel model, private-shared subspaces separation (P3S), which explicitly learns representations partitioned into two kinds of subspaces: 1) common representations that capture the cross-modal correlation in a shared subspace and 2) private representations that model the interference within each modality in two private subspaces. By imposing orthogonality constraints between the shared subspace and the private subspaces during a one-stage joint learning procedure, our model learns more effective common representations for the different modalities in the shared subspace by fully excluding the interference within each modality. Extensive experiments on cross-modal retrieval verify the advantages of our P3S method over 15 state-of-the-art methods on four widely used cross-modal datasets.
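
The abstract does not specify the form of the orthogonality constraints. As a minimal sketch, assuming a common choice (the squared Frobenius norm of the product between the shared and private representation matrices, which may differ from the loss actually used in P3S), such a penalty could look like the following. The function name, array shapes, and toy values are hypothetical and chosen for illustration only.

```python
import numpy as np


def orthogonality_penalty(shared, private):
    """Squared Frobenius norm of shared^T @ private (an assumed penalty form).

    shared:  array of shape (batch, dim_shared), common representations
    private: array of shape (batch, dim_private), private representations
    The value is 0 when every shared dimension is orthogonal to every
    private dimension across the batch, and grows as they overlap.
    """
    cross = shared.T @ private  # shape (dim_shared, dim_private)
    return float(np.sum(cross ** 2))


# Hypothetical toy batch of two samples with 2-D representations.
shared = np.array([[1.0, 0.0],
                   [0.0, 0.0]])
private_orthogonal = np.array([[0.0, 0.0],
                               [0.0, 1.0]])
private_overlapping = np.array([[1.0, 0.0],
                                [0.0, 1.0]])

penalty_orthogonal = orthogonality_penalty(shared, private_orthogonal)
penalty_overlapping = orthogonality_penalty(shared, private_overlapping)
```

In a joint training setup of the kind the abstract describes, a term like this would presumably be computed once per modality (image shared versus image private, text shared versus text private) and added to the retrieval objective, so that minimizing it pushes interference out of the shared subspace.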
