Sentiment analysis refers to mining textual content with the aim of identifying and extracting subjective opinions. However, most existing methods neglect other important modalities, e.g., the audio modality, which can provide intrinsic complementary knowledge for sentiment analysis. Furthermore, most existing sentiment analysis work cannot continuously learn new sentiment analysis tasks or discover latent correlations among distinct modalities. To address these concerns, we propose a novel Lifelong Text-Audio Sentiment Analysis (LTASA) model that continuously learns text-audio sentiment analysis tasks and effectively explores intrinsic semantic relationships from both intra-modality and inter-modality perspectives. More specifically, a modality-specific knowledge dictionary is developed for each modality to obtain shared intra-modality representations across text-audio sentiment analysis tasks. Additionally, based on the information dependence between the text and audio knowledge dictionaries, a complementarity-aware subspace is developed to capture latent nonlinear inter-modality complementary knowledge. To learn text-audio sentiment analysis tasks sequentially, a new online multi-task optimization pipeline is designed. Finally, we evaluate our model on three widely used datasets. Compared with representative baseline methods, LTASA achieves significantly better performance on five evaluation metrics.
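The abstract does not state the underlying objective, so purely as an illustrative sketch, a lifelong cross-modal dictionary-learning formulation of this general shape could be written as follows; the symbols $X_m^{(t)}$, $D_m$, $A_m^{(t)}$, $\lambda$, $\gamma$, and the coupling term $\Omega$ are illustrative assumptions, not notation from the paper:

\[
\min_{\{D_m\},\,\{A_m^{(t)}\}} \;\sum_{t=1}^{T} \sum_{m \in \{\text{text},\,\text{audio}\}} \Big( \big\| X_m^{(t)} - D_m A_m^{(t)} \big\|_F^2 \;+\; \lambda \big\| A_m^{(t)} \big\|_1 \Big) \;+\; \gamma\, \Omega\!\left(D_{\text{text}},\, D_{\text{audio}}\right)
\]

Here $X_m^{(t)}$ would denote the feature matrix of modality $m$ for task $t$, $D_m$ a modality-specific knowledge dictionary shared across tasks, $A_m^{(t)}$ the task-specific codes, and $\Omega$ a regularizer that couples the two dictionaries through a shared complementarity-aware subspace. In an online multi-task setting of this kind, each newly arriving task $t$ would update the dictionaries and the subspace without revisiting data from earlier tasks.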