Abstract

Identifying the perceived emotional content of music is an important aspect of efficient search, retrieval, and management of music collections. One of the most promising use cases for music organization is the emotion-based playlist, where automatic music emotion recognition provides emotion-related information that is otherwise generally unavailable. Motivated by the importance of the auditory system in emotion recognition and processing, we propose a new cochleogram-based system for detecting affective musical content. To simulate the response of the human auditory periphery, the music audio signal is processed by a detailed biophysical cochlear model, yielding an output that closely matches the characteristics of human hearing. In the proposed approach, a convolutional neural network (CNN) is used to extract the relevant music features from cochleogram images constructed directly from the basilar membrane response. To validate the practical implications of the approach with regard to its possible integration into digital music libraries, an extensive study was conducted to evaluate its predictive performance across different aspects of music emotion recognition. The proposed approach was evaluated on the publicly available 1000 Songs database, and the experimental results show that it outperforms both common musical features (such as tempo, mode, pitch, clarity, and perceptually motivated mel-frequency cepstral coefficients (MFCC)) and the official "MediaEval" challenge results on the same reference database. Our findings clearly show that the proposed approach can lead to better music emotion recognition performance and can be used as part of a state-of-the-art music information retrieval system.
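The abstract does not give implementation details, so the following is only a minimal sketch of the cochleogram-to-CNN pipeline it describes. It assumes PyTorch, a precomputed cochleogram image (the basilar membrane response from the cochlear model is not modeled here and is replaced by a random tensor), a hypothetical layer layout, and a two-dimensional valence/arousal output as used in the MediaEval "Emotion in Music" task; none of these specifics are taken from the paper.

```python
# Illustrative sketch, not the authors' architecture: a small CNN that maps
# cochleogram images (cochlear channels x time frames) to valence/arousal.
import torch
import torch.nn as nn

class CochleogramCNN(nn.Module):
    def __init__(self, n_outputs: int = 2):  # e.g. valence and arousal
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global pooling -> fixed-size embedding
        )
        self.regressor = nn.Linear(64, n_outputs)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, n_channels, n_frames) cochleogram image
        h = self.features(x).flatten(1)
        return self.regressor(h)

if __name__ == "__main__":
    # Stand-in input: 64 cochlear channels x 256 time frames per excerpt.
    cochleogram = torch.randn(4, 1, 64, 256)
    model = CochleogramCNN()
    print(model(cochleogram).shape)  # torch.Size([4, 2])
```

In a real pipeline, the stand-in tensor would be replaced by cochleogram images obtained from the cochlear model's basilar membrane response, and the network would be trained with a regression loss against the annotated valence/arousal values.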
