Abstract

We consider the task of multi-view subspace learning, which integrates multi-view information to learn a unified representation for multimedia data. In real-world scenarios, views often differ widely in semantic level. Existing graph-based methods neglect this semantic inconsistency: they directly convert heterogeneous information into local affinity matrices and fuse them, which inevitably destroys the valuable high-semantic-level structure. To address semantic inconsistency, we propose Multi-view Subspace Skeleton Embedding (MSSE), in which the high-level semantic structure is explicitly taken as the skeleton of the learned subspace. Specifically, in cooperation with a set of anchor points, the high-level semantic structure is adopted as a semantic constraint that guides a multi-graph learning process based on RESCAL tensor factorization. To guarantee sufficient geometric coverage of the skeleton in the learned subspace, we enforce the diversity of the anchor points with a Determinantal Point Process (DPP) regularizer. Compared with traditional methods, the learned subspace has higher semantic consistency and is more robust to noisy views. Experiments on real-world image datasets demonstrate promising performance compared with state-of-the-art graph-based methods.
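
The two building blocks named in the abstract can be illustrated with a minimal NumPy sketch. It is not the MSSE objective, which the abstract does not specify. It shows only the standard RESCAL reconstruction form, in which each view's graph X_k is approximated by A R_k A^T with a shared embedding A, and a generic DPP-style diversity score, the log-determinant of a kernel matrix over the anchor points. The function names, the RBF kernel, and the `gamma` parameter are assumptions made for illustration.

```python
import numpy as np

def rescal_reconstruction_error(graphs, A, cores):
    """Sum over views of ||X_k - A R_k A^T||_F^2 (standard RESCAL form).

    graphs: list of (n, n) affinity matrices, one per view
    A:      (n, d) shared embedding
    cores:  list of (d, d) view-specific core matrices R_k
    """
    return sum(
        np.linalg.norm(X - A @ R @ A.T, "fro") ** 2
        for X, R in zip(graphs, cores)
    )

def dpp_diversity(anchors, gamma=1.0):
    """Log-determinant of an RBF kernel over the anchors.

    Assumption: an RBF kernel is used here only as an example; the
    abstract does not say which kernel the DPP regularizer uses.
    Larger values correspond to more spread-out anchor points.
    """
    diffs = anchors[:, None, :] - anchors[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    K = np.exp(-gamma * sq_dists)
    # Small jitter keeps the determinant numerically well defined.
    _, logdet = np.linalg.slogdet(K + 1e-8 * np.eye(len(anchors)))
    return logdet

# Hypothetical toy data: two views that share one embedding exactly.
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 2))
cores = [rng.normal(size=(2, 2)) for _ in range(2)]
graphs = [A @ R @ A.T for R in cores]

spread = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
clustered = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]])
```

In this sketch, adding `dpp_diversity` to an objective as a reward (or its negative as a penalty) would push anchors apart, which is the intuition behind using a DPP term to obtain geometric coverage.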
