Abstract

Multi-modal multi-label learning provides a fundamental framework for complex objects that can be represented with multiple modalities and annotated with multiple labels simultaneously. Different modalities usually provide complementary information, which can lead to improved performance; moreover, exploiting label correlations is crucial to multi-label learning. However, most existing multi-label learning approaches do not sufficiently exploit the complementary information among modalities. In this paper, we propose a novel end-to-end deep learning framework named Rethinking Modal-oriented Label Correlations (RMLC), which sequentially polishes the label prediction with each individual modality. To explicitly account for the correlated prediction of multiple labels, RMLC leverages an efficient sequential modal-based exploration to rethink label correlations. The final prediction of each label combines the modality-specific prediction with the predictions of the other labels through cross-modal interaction. Comprehensive experiments on benchmark datasets validate the effectiveness and competitiveness of the proposed RMLC approach.
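The sequential "rethinking" idea described above can be illustrated with a minimal sketch. This is not the authors' implementation; the dimensions, the linear modality-specific predictors `W`, and the label-correlation matrix `C` are all hypothetical, chosen only to show how each modality in turn refines a running multi-label prediction using both its own features and the current estimates of the other labels.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical dimensions: 3 modalities, 8-dim features each, 5 labels.
n_modalities, d, n_labels = 3, 8, 5
x = [rng.normal(size=d) for _ in range(n_modalities)]              # one feature vector per modality
W = [rng.normal(size=(n_labels, d)) for _ in range(n_modalities)]  # modality-specific predictors (hypothetical)
C = 0.1 * rng.normal(size=(n_labels, n_labels))                    # label-correlation matrix (hypothetical)
np.fill_diagonal(C, 0.0)                                           # a label does not reinforce itself

# Sequentially polish the prediction with each modality: every step
# combines the modality-specific score with a correlation term driven
# by the current estimates of the other labels.
pred = np.full(n_labels, 0.5)
for m in range(n_modalities):
    logits = W[m] @ x[m] + C @ pred
    pred = sigmoid(logits)

print(np.round(pred, 3))
```

In an end-to-end deep framework such as the one the abstract describes, the linear maps would be replaced by learned modality-specific networks and the correlation term by a learned cross-modal interaction, with all components trained jointly.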
