Abstract

This paper investigates to what extent state-of-the-art machine learning methods are effective in classifying emotions in the context of individual musical instruments, and how their performance compares with that of musically trained and untrained listeners. To address these questions, we created a novel dataset of 391 classical and acoustic guitar excerpts annotated along four emotions (aggressiveness, relaxation, happiness, and sadness) at three intensity levels (low, medium, high), according to the emotion intended by 30 professional guitarists acting as both composers and performers. A first experiment investigated listeners’ perception, involving 8 professional guitarists and 8 non-musicians. The results showed that the emotions intended by a composer-performer are not always well recognized by listeners, and in general not with the same intensity. Listeners’ identification accuracy was proportional to the intensity with which an emotion was expressed, and emotions were recognized better by musicians than by listeners without a musical background. These differences between the two groups were found across the intensity levels of the intended emotions. A second experiment investigated machine listening performance based on a transfer learning method. To compare machine and human identification accuracies fairly, we derived a fifth, “ambivalent” category from the machine listening output (i.e., excerpts rated with more than one predominant emotion). The results showed that the machine perception of emotions matched or even exceeded the musicians’ performance for all emotions except “relaxation”. The differences between the intended and human-perceived emotions, as well as those due to musical training, suggest that a device or application involving a music emotion recognition system should take into account the characteristics of its users (in particular their musical expertise) as well as their roles (e.g., composer, performer, listener). For developers, this translates into using datasets annotated by different categories of annotators whose roles and musical expertise match those of the end users. These results are particularly relevant to the creation of emotionally aware smart musical instruments.
