Abstract

Music plays a vital role in human culture and society, serving as a universal form of expression. However, accurately classifying the emotions conveyed by music remains challenging, both because emotional expression in music is intricate and because diverse data sources must be integrated. To address these challenges, we propose the Multilayered Music Decomposition and Multimodal Integration Interaction (MMD-MII) model. The model uses cross-processing to let the audio and lyrics representations interact, keeping the emotional representation coherent across the two modalities. We also introduce a hierarchical framework grounded in music theory that focuses on the main and chorus sections, with the chorus processed separately to extract a more precise emotional representation. Experiments on the DEAM and FMA datasets demonstrate the effectiveness of MMD-MII, which achieves accuracies of 49.68% and 49.54%, respectively. The model outperforms existing methods in both accuracy and F1 score, with promising implications for music recommendation systems, healthcare, psychology, and advertising, where accurate emotional analysis is essential.
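
The abstract does not specify how the cross-processing between audio and lyrics is implemented. As a rough illustration only, the sketch below assumes it resembles bidirectional scaled dot-product cross-attention, in which each modality attends over the other before the two are pooled and concatenated. The function names (`cross_attend`, `fuse`), the feature dimension, and the random inputs are all hypothetical and are not taken from the MMD-MII paper.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(queries, context):
    """Each query row attends over all context rows (scaled dot-product)."""
    d = queries.shape[-1]
    scores = queries @ context.T / np.sqrt(d)   # (n_queries, n_context)
    weights = softmax(scores, axis=-1)          # each row sums to 1
    return weights @ context, weights           # (n_queries, d), (n_queries, n_context)

def fuse(audio, lyrics):
    """Hypothetical fusion: bidirectional cross-attention, residual add, mean pool."""
    audio_ctx, _ = cross_attend(audio, lyrics)   # audio frames informed by lyrics
    lyrics_ctx, _ = cross_attend(lyrics, audio)  # lyric tokens informed by audio
    audio_vec = (audio + audio_ctx).mean(axis=0)
    lyrics_vec = (lyrics + lyrics_ctx).mean(axis=0)
    return np.concatenate([audio_vec, lyrics_vec])

# Assumed toy inputs: 6 audio frames and 4 lyric tokens, both with 8 features.
rng = np.random.default_rng(0)
audio = rng.normal(size=(6, 8))
lyrics = rng.normal(size=(4, 8))

fused = fuse(audio, lyrics)
print(fused.shape)  # (16,)
```

In a full model, a fused vector of this kind would feed an emotion classifier. The separate chorus branch described in the abstract could in principle reuse the same interaction step on chorus-only segments, though the abstract gives no detail on that.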
