Abstract

Personalizing a music emotion recognition model is necessary because the perception of music emotion is highly subjective, but personalization is a time-consuming process. In this paper, we consider how to expedite a personalization process that begins with a general model trained offline on a general user base and progressively adapts the model to an individual listener using that listener's emotion annotations. Specifically, we focus on reducing the number of annotations the listener must provide. We investigate and evaluate four component tying methods: single-group tying, quadrant-wise tying, hierarchical tying, and random tying. These methods exploit the available annotations by identifying related model parameters on the fly and updating them jointly. For the evaluation, we use the AMG1608 dataset, which contains clip-level valence-arousal emotion ratings of 1608 30-second music clips annotated by 665 listeners. As the general model, we use the acoustic emotion Gaussians (AEG) model, which uses a mixture of Gaussian components to learn the mapping between the acoustic feature space and the emotion space. The results show that model adaptation with component tying requires only 10-20 personal annotations to reach the same level of prediction accuracy as a baseline adaptation method that uses 50 personal annotations without component tying.
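To make the tying idea concrete, here is a minimal Python sketch of quadrant-wise tying under simplifying assumptions: each Gaussian component is reduced to its mean in the valence-arousal plane, and the joint update is a simple interpolation toward the annotation mean rather than the paper's actual adaptation algorithm. All names and values (`component_va_means`, `quadrant`, `alpha`) are illustrative, not from the paper.

```python
import numpy as np

# Hypothetical general model: K Gaussian components, each summarized here by
# its mean in the 2-D valence-arousal (VA) emotion space.
rng = np.random.default_rng(0)
K = 8
component_va_means = rng.uniform(-1, 1, size=(K, 2))  # columns: valence, arousal

def quadrant(va):
    """Map a VA point to one of the four quadrants (0..3)."""
    return (va[0] >= 0) * 1 + (va[1] >= 0) * 2

# Quadrant-wise tying: components whose VA means fall in the same quadrant
# form one group and share a single update, so a few annotations can move
# many parameters at once.
groups = {}
for k in range(K):
    groups.setdefault(quadrant(component_va_means[k]), []).append(k)

# A handful of personal annotations (VA ratings of clips), e.g. 10-20 points.
personal_va = rng.uniform(-1, 1, size=(12, 2))

# Interpolate each tied group toward the mean of the annotations that fall in
# its quadrant (a stand-in for a proper MAP/EM-style adaptation step).
alpha = 0.5  # adaptation weight; a real system would tune or derive this
for q, members in groups.items():
    in_q = personal_va[[quadrant(va) == q for va in personal_va]]
    if len(in_q) == 0:
        continue  # no evidence for this quadrant; leave its components alone
    target = in_q.mean(axis=0)
    for k in members:
        component_va_means[k] = (1 - alpha) * component_va_means[k] + alpha * target

print(component_va_means)
```

Because every component in a quadrant shares the update, a dozen annotations can shift all tied component means at once, which is the intuition behind needing only 10-20 annotations rather than 50.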
