Abstract

In recent years, multimodal sentiment analysis (MSA) research has been devoted to developing effective fusion mechanisms and has made notable advances. However, several challenges remain inadequately addressed: models make insufficient use of important inter-modal information (relevance and independence information), which introduces additional noise, and the traditional ternary symmetric architecture cannot cope well with the uneven distribution of task-related information among modalities. We therefore propose the Mutual Information Maximization and Feature Space Separation and Bi-Bimodal Modality Fusion (MFSBF) framework, which effectively alleviates these problems. To address the underutilization of important inter-modal information, we design a mutual information maximization module and a feature space separation module. The mutual information module maximizes the mutual information between two modalities to retain more relevance (modality-invariant) information, while the feature separation module separates fused features to prevent the loss of independence (modality-specific) information during the fusion process. Because different modalities contribute unequally to the model, we adopt a bimodal fusion architecture that fuses two bimodal pairs. This architecture attends more to the modality containing more task-related information and alleviates the uneven distribution of useful information among modalities. On two publicly available datasets (CMU-MOSI and CMU-MOSEI), our model achieves better or comparable results than previous models, demonstrating the efficacy of our method.
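The abstract names the framework's three components but not their concrete forms. Below is a minimal PyTorch sketch of one plausible instantiation, assuming an InfoNCE-style lower bound for the mutual information maximization module, a soft orthogonality penalty between the two fused pair representations for the feature space separation module, and text-anchored pairs for the bi-bimodal fusion. All names here (mi_lower_bound, separation_penalty, BiBimodalFusion, lam_mi, lam_sep) are hypothetical and not taken from the paper.

```python
# Minimal sketch of MFSBF-style objectives (hypothetical forms,
# not the paper's exact losses).
import torch
import torch.nn as nn
import torch.nn.functional as F


def mi_lower_bound(z_a: torch.Tensor, z_b: torch.Tensor,
                   temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE-style lower bound on I(z_a; z_b), estimated over a batch.

    Maximizing it pulls paired modality embeddings together, preserving
    relevance (modality-invariant) information. z_a, z_b: (batch, dim).
    """
    z_a = F.normalize(z_a, dim=-1)
    z_b = F.normalize(z_b, dim=-1)
    logits = (z_a @ z_b.t()) / temperature            # in-batch similarities
    targets = torch.arange(z_a.size(0), device=z_a.device)
    return -F.cross_entropy(logits, targets)          # larger = more shared info


def separation_penalty(h_1: torch.Tensor, h_2: torch.Tensor) -> torch.Tensor:
    """Soft orthogonality between two fused representations, so each
    retains distinct independence (modality-specific) information
    rather than collapsing onto the shared signal. Inputs: (batch, dim)."""
    h_1 = F.normalize(h_1, dim=-1)
    h_2 = F.normalize(h_2, dim=-1)
    return (h_1 * h_2).sum(dim=-1).pow(2).mean()


class BiBimodalFusion(nn.Module):
    """Fuses two bimodal pairs anchored on text, the modality that usually
    carries the most task-related signal: (text, audio) and (text, video)."""

    def __init__(self, dim: int):
        super().__init__()
        self.fuse_ta = nn.Linear(2 * dim, dim)   # text-audio pair
        self.fuse_tv = nn.Linear(2 * dim, dim)   # text-video pair
        self.head = nn.Linear(2 * dim, 1)        # sentiment score

    def forward(self, z_t, z_a, z_v):
        h_ta = torch.tanh(self.fuse_ta(torch.cat([z_t, z_a], dim=-1)))
        h_tv = torch.tanh(self.fuse_tv(torch.cat([z_t, z_v], dim=-1)))
        pred = self.head(torch.cat([h_ta, h_tv], dim=-1))
        return pred, h_ta, h_tv


def total_loss(pred, labels, z_t, z_a, z_v, h_ta, h_tv,
               lam_mi: float = 0.1, lam_sep: float = 0.1) -> torch.Tensor:
    """Hypothetical combined objective: task loss, minus the MI bounds
    (so they are maximized), plus the separation penalty."""
    task = F.l1_loss(pred.squeeze(-1), labels)
    mi = mi_lower_bound(z_t, z_a) + mi_lower_bound(z_t, z_v)
    sep = separation_penalty(h_ta, h_tv)
    return task - lam_mi * mi + lam_sep * sep
```

Anchoring both pairs on text reflects the abstract's point about uneven information distribution: the modality carrying more task-related information participates in both fusions, rather than all three modalities being treated symmetrically.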
