Abstract

Within the last fifteen years, the field of Music Information Retrieval (MIR) has made tremendous progress in developing algorithms for organizing and analyzing the ever-growing and increasingly varied body of music and music-related data available digitally. However, developing content-based methods to enable or improve multimedia retrieval remains a central challenge. In this perspective paper, we critically examine the problem of automatic chord estimation from audio recordings as a case study of content-based algorithms, and point out several bottlenecks in current approaches: expressiveness and flexibility are obtained at the expense of robustness, and vice versa; available multimodal sources of information are little exploited; current architectures are limited in their ability to model multi-faceted and strongly interrelated musical information; and models are typically restricted to short-term analysis that does not account for the hierarchical temporal structure of musical signals. Dealing with music data requires the ability to handle both uncertainty and complex relational structure at multiple levels of representation. Traditional approaches have generally treated these two aspects separately: probability and learning are the standard way to represent uncertainty in knowledge, while logical representation is the standard way to represent knowledge and complex relational information. We argue that the identified hurdles could be overcome by recent developments in Statistical Relational Artificial Intelligence (StarAI), which unifies probability, logic and (deep) learning. We show that existing approaches used in MIR find powerful extensions and unifications in StarAI, and we explain why we think it is time to consider the new perspectives offered by this promising research field.

Highlights

  • Understanding music has been a long-standing problem for very diverse communities

  • We critically examine the problem of automatic chord estimation from audio recordings as a case study of content-based algorithms, and point out several bottlenecks in current approaches: expressiveness and flexibility are obtained at the expense of robustness, and vice versa; available multimodal sources of information are little exploited; current architectures are limited in their ability to model multi-faceted and strongly interrelated musical information; models are typically restricted to short-term analysis that does not account for the hierarchical temporal structure of musical signals; and simplified versions of Music Information Retrieval (MIR) problems cannot be generalized to real problems

  • The rest of the paper is organized as follows: Section 2 critically reviews existing content-based MIR approaches, focusing on the task of automatic chord estimation as a case study, and identifies four major deficiencies of computational analysis models: the inability to handle both uncertainty and rich relational structure; the inability to handle multiple abstraction levels and to act on multiple time scales; the failure to exploit available multimodal information; and the inability to generalize from simplified problems to complex tasks. Section 3 discusses the need for an integrated research framework and presents the perspectives offered by statistical relational


Summary

INTRODUCTION

Understanding music has been a long-standing problem for very diverse communities. Formalizing musical knowledge, and understanding how human beings create and listen to music, has proven very challenging given the large number of levels involved, ranging from hearing to perception and from acoustics to music theory. In the past 30 years, an impressive amount of research work in different fields related to music has been done with the aim of clarifying the relations between these levels and of finding good representations for musical knowledge in different forms, such as scores, intermediate graphic representations and so on.

A Brief History of MIR
Progress in the MIR Field
The Need to Integrate Diverse
StarAI
Paper Organization
Probabilistic Graphical Models
Logic: Dealing With Complex Relational
Logic and Probability Theory
Complex Relational Structure at Multiple Abstraction Levels and Time
Capturing High-Level Information at Multiple
Increasing Amount of Various Heterogeneous
Some Possible Directions to Operate With
Toward Unification With StarAI
On the Benefits of an Integrated
A Case Study for MIR
Formal Definition of MLNs
Chord Estimation Markov Logic Network
Challenges of StarAI for the MIR
Potential Perspectives for MIR
CONCLUSIONS
