Abstract

Multi-view classification aims to efficiently exploit information from different views to improve classification performance. Many effective multi-view learning methods have recently been proposed for multi-view data analysis. However, most existing methods consider only the correlations between views and ignore the potential correlations between samples. Normally, the views of samples belonging to the same category should share more consistent information, while those belonging to different categories should exhibit more distinctions. We therefore argue that the correlations and distinctions between the views of different samples also contribute to constructing feature representations that are more conducive to classification. To build an end-to-end general multi-view classification framework that better exploits sample information to obtain more reasonable feature representations, we propose a novel joint long- and short-span self-attention network (JLSSAN). We design two self-attention spans that focus on different information: they allow each feature vector to be iteratively updated based on its attention to other views and other samples, which provides better integration of information across views and across samples. In addition, we adopt a novel weight-based loss fusion strategy, which encourages the model to learn more reasonable self-attention maps between views. Our method outperforms state-of-the-art methods by more than 3% in accuracy on multiple benchmarks, demonstrating its effectiveness.
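
To make the two attention spans concrete, below is a minimal PyTorch sketch of one such joint update step: a short-span attention that mixes the views of a single sample, and a long-span attention that mixes the same views across samples in a batch. The module and tensor names (JointSpanAttention, short_attn, long_attn) are illustrative assumptions for exposition, not the authors' actual JLSSAN implementation, which the abstract does not specify.

```python
# Sketch only: the names and layer choices here are assumptions,
# not the authors' released JLSSAN code.
import torch
import torch.nn as nn

class JointSpanAttention(nn.Module):
    """One update step: short-span attention attends over the views of
    each sample; long-span attention attends over the samples in the
    batch for each view."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.short_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.long_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_views, dim) -- one feature vector per view.
        # Short span: each view attends to the other views of its sample.
        s, _ = self.short_attn(x, x, x)
        x = self.norm1(x + s)
        # Long span: transpose so views act as the batch dimension and
        # samples act as the sequence, letting each sample's feature
        # attend to the same view of other samples.
        xt = x.transpose(0, 1)               # (num_views, batch, dim)
        l, _ = self.long_attn(xt, xt, xt)
        x = self.norm2(x + l.transpose(0, 1))
        return x

# Usage: stack or iterate this block to refine per-view features
# before feeding them to a classifier head.
feats = torch.randn(8, 3, 64)    # 8 samples, 3 views, 64-dim features
block = JointSpanAttention(dim=64)
refined = block(feats)           # same shape, now view- and sample-aware
```

Iterating such a block is one plausible way for each feature vector to be updated based on its attention to other views and other samples, as described above; the paper's actual layer arrangement may differ.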
