Enhancing Music Mood Recognition with LLMs and Audio Signal Processing: A Multimodal Approach

Prof. R. Y. Sable, Kosheen Sadhu, Aqsa Sayyed, Prathamesh Ghatole, Baliraje Kalyane

doi:10.22214/ijraset.2024.63590

Abstract

Music Mood Recognition aims to let computers understand the emotions behind music the way humans do, improving machine perception of media in support of services such as music recommendation, therapeutic interventions, and Human-Computer Interaction. In this paper, we propose a novel approach to improving Music Mood Recognition using a multimodal model that combines the lyrical and audio features of a song. Lyrical features are analysed using state-of-the-art open-source Large Language Models such as Microsoft Phi-3 to classify lyrics into one of four emotion categories as per James Russell's Circumplex Model. Audio features are used to train a Deep Learning (ConvNet) model to predict emotion classes. A multimodal combiner model over audio and lyrics is then trained and deployed to enable accurate predictions. The dataset used in this research is “MoodyLyrics”, a collection of 2000+ songs, each labelled with one of four emotion classes as per James Russell's Circumplex Model. Due to compute limitations, we use a balanced set of 1000 songs to train and test our models. The work in this paper outperforms most other multimodal approaches by achieving higher accuracy with universal language support.
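The abstract does not say how the combiner merges the two modalities, so the following is only a minimal sketch of one common option, late fusion by weighted averaging of per-class probabilities. The label names (`happy`, `angry`, `sad`, `relaxed`), the `lyrics_weight` parameter, and the function names are assumptions made for illustration and are not taken from the paper; the trained combiner the authors describe may work quite differently.

```python
from typing import Dict, Sequence

# Assumed labels for the four quadrants of the valence-arousal plane.
QUADRANTS = ("happy", "angry", "sad", "relaxed")


def quadrant(valence: float, arousal: float) -> str:
    """Map a (valence, arousal) point to its circumplex quadrant label."""
    if valence >= 0:
        return "happy" if arousal >= 0 else "relaxed"
    return "angry" if arousal >= 0 else "sad"


def fuse(
    lyrics_probs: Sequence[float],
    audio_probs: Sequence[float],
    lyrics_weight: float = 0.5,
) -> Dict[str, float]:
    """Late fusion: weighted average of two per-class probability vectors.

    Both inputs are ordered like QUADRANTS. lyrics_weight is a hypothetical
    tuning knob in [0, 1]; the audio model receives the remaining weight.
    """
    if len(lyrics_probs) != len(QUADRANTS) or len(audio_probs) != len(QUADRANTS):
        raise ValueError("expected one probability per quadrant")
    if not 0.0 <= lyrics_weight <= 1.0:
        raise ValueError("lyrics_weight must lie in [0, 1]")
    mixed = [
        lyrics_weight * l + (1.0 - lyrics_weight) * a
        for l, a in zip(lyrics_probs, audio_probs)
    ]
    total = sum(mixed)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    # Renormalise so the fused scores form a probability distribution.
    return {label: p / total for label, p in zip(QUADRANTS, mixed)}


def predict(
    lyrics_probs: Sequence[float],
    audio_probs: Sequence[float],
    lyrics_weight: float = 0.5,
) -> str:
    """Return the quadrant label with the highest fused probability."""
    fused = fuse(lyrics_probs, audio_probs, lyrics_weight)
    return max(fused, key=fused.get)
```

In this sketch, a lyrics model that is confident about `happy` and an audio model that leans towards `sad` are reconciled by the weight: a `lyrics_weight` near 1 favours the lyrics prediction, while a value near 0 favours the audio prediction.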

More From: International Journal for Research in Applied Science and Engineering Technology
