MMD-MII Model: A Multilayered Analysis and Multimodal Integration Interaction Approach Revolutionizing Music Emotion Classification

Jingyi Wang,Thippa Reddy Gadekallu,Alireza Sharifi,Achyut Shankar

doi:10.1007/s44196-024-00489-6

Jingyi Wang, Thippa Reddy Gadekallu + Show 2 more

Open Access

https://doi.org/10.1007/s44196-024-00489-6

Copy DOI

Abstract

Music plays a vital role in human culture and society, serving as a universal form of expression. However, accurately classifying music emotions remains challenging due to the intricate nature of emotional expressions in music and the integration of diverse data sources. To address these challenges, we propose the Multilayered Music Decomposition and Multimodal Integration Interaction (MMD-MII) model. This model employs cross-processing to facilitate interaction between audio and lyrics, ensuring coherence in emotional representation. Additionally, we introduce a hierarchical framework based on the music theory, focusing on the main and chorus sections, with the chorus processed separately to extract precise emotional representations. Experimental results on the DEAM and FMA datasets demonstrate the effectiveness of the MMD-MII model, achieving accuracies of 49.68% and 49.54% respectively. Compared with the existing methods, our model outperforms in accuracy and F1 scores, offering promising implications for music recommendation systems, healthcare, psychology, and advertising, where accurate emotional analysis is essential.

Full Text