Abstract

One of the central goals of Music Information Retrieval (MIR) is the quantification of similarity between or within pieces of music. These quantitative relations should mirror the human perception of music similarity, which, however, is highly subjective, with low inter-rater agreement. Unfortunately, this principal problem has so far been given little attention in MIR. Since it is not meaningful to have computational models that go beyond the level of human agreement, these levels of inter-rater agreement present a natural upper bound for any algorithmic approach. We illustrate this fundamental problem in the evaluation of MIR systems using results from two typical application scenarios: (i) modelling of music similarity between pieces of music; and (ii) music structure analysis within pieces of music. For both applications, we derive upper bounds on performance that are due to the limited inter-rater agreement. We compare these upper bounds to the performance of state-of-the-art MIR systems and show how the upper bounds prevent further progress in developing better MIR systems.
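To make the upper-bound argument concrete, here is a minimal illustrative sketch (not the paper's method; the ratings and the agreement measure are invented for illustration). When human raters disagree on similarity judgements, even an oracle system that predicts the majority label for every item cannot reach a mean agreement of 1.0 with the raters, so human disagreement caps the achievable evaluation score:

```python
from collections import Counter

def mean_agreement(predictions, ratings):
    """Mean fraction of items on which the predictions match each rater."""
    per_rater = [
        sum(p == r for p, r in zip(predictions, rater)) / len(predictions)
        for rater in ratings
    ]
    return sum(per_rater) / len(per_rater)

# Hypothetical similarity judgements of 8 query/candidate pairs by 3 raters
# (0 = not similar, 1 = somewhat similar, 2 = very similar); made-up data.
raters = [
    [2, 1, 0, 2, 1, 0, 2, 1],
    [2, 1, 0, 1, 1, 0, 2, 2],
    [2, 0, 0, 2, 1, 1, 2, 1],
]

# The best any system can do against these raters is to output the
# per-item majority label; its score is the ceiling for all algorithms.
majority = [Counter(col).most_common(1)[0][0] for col in zip(*raters)]
ceiling = mean_agreement(majority, raters)
print(f"upper bound on mean agreement with raters: {ceiling:.3f}")
```

Because the raters disagree on three of the eight items, the ceiling here is well below 1.0; any algorithm scoring near it is effectively at the level of human agreement, and differences above that level are not meaningful.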

Highlights

  • The most important concept in Music Information Retrieval (MIR) is that of music similarity

  • MIREX is an annual evaluation campaign for MIR algorithms, allowing for a fair comparison in standardized settings across a range of different tasks. As such, it has been of great value for the MIR community and an important driving force of research and progress within the community

  • In our review of related work, we focus on papers directly discussing results of the AMS task, thereby addressing the problem of evaluating audio music similarity


Summary

Introduction

The most important concept in Music Information Retrieval (MIR) is that of music similarity. Proper modelling of music similarity is at the heart of every application allowing automatic organization and processing of music databases. Music similarity can be modelled at many different levels, e.g. between complete pieces of music or by exploring structure within pieces of music. The respective tasks at the annual ‘Music Information Retrieval Evaluation eXchange’ (MIREX, 2006; Downie et al., 2014) are the ‘Audio Music Similarity and Retrieval’ (AMS) task and the ‘Music Structural Segmentation’ (MSS) task. MIREX is an annual evaluation campaign for MIR algorithms, allowing for a fair comparison in standardized settings across a range of different tasks. As such, it has been of great value for the MIR community and an important driving force of research and progress within the community. It has even been stated that evaluation campaigns like MIREX ‘define de facto the topics that new contributors to the MIR field will work on’ (Serra et al., 2013, p. 33).

