Abstract

There is growing evidence that nonlinear time series analysis techniques can be used to successfully characterize, classify, or process signals derived from real-world dynamics even though these are not necessarily deterministic or stationary. In the present study, we proceed in this direction by addressing an important problem our modern society is facing: the automatic classification of digital information. In particular, we address the automatic identification of cover songs, i.e., alternative renditions of a previously recorded musical piece. For this purpose, we propose a recurrence quantification analysis measure that allows the tracking of potentially curved and disrupted traces in cross recurrence plots (CRPs). We apply this measure to CRPs constructed from the state space representation of musical descriptor time series extracted from the raw audio signal. We show that our method identifies cover songs with higher accuracy than previously published techniques. Beyond the particular application proposed here, we discuss how our approach can be useful for the characterization of a variety of signals from different scientific disciplines. We study coupled Rössler dynamics with stochastically modulated mean frequencies as one concrete example to illustrate this point.
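To make the pipeline concrete, the following Python sketch builds a binary CRP from two delay-embedded descriptor series and scores long, possibly curved and disrupted traces with a simple dynamic-programming recursion. This is a minimal illustration, not the authors' published formulation: all function names, the embedding parameters, the recurrence rate, and the disruption penalty are assumptions chosen for clarity.

```python
import numpy as np

def delay_embed(x, dim=3, tau=2):
    """Time-delay embedding of a 1-D descriptor series into state space."""
    n = len(x) - (dim - 1) * tau
    return np.column_stack([x[i * tau : i * tau + n] for i in range(dim)])

def cross_recurrence_plot(x, y, dim=3, tau=2, rate=0.1):
    """Binary CRP between two series: entry (i, j) is 1 when the embedded
    states are closer than a threshold chosen to give recurrence rate `rate`."""
    X, Y = delay_embed(x, dim, tau), delay_embed(y, dim, tau)
    d = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2)
    eps = np.quantile(d, rate)  # fixed-recurrence-rate thresholding
    return (d <= eps).astype(np.uint8)

def trace_score(crp, penalty=1.0):
    """Score long traces in the CRP while tolerating curvature (local tempo
    deviations) and short disruptions; a simplified stand-in for the
    paper's cumulative RQA measure, not its exact recursion."""
    n, m = crp.shape
    q = np.zeros((n, m))
    for i in range(2, n):
        for j in range(2, m):
            # Curved traces: continue from (i-1,j-1), (i-2,j-1), or (i-1,j-2).
            prev = max(q[i - 1, j - 1], q[i - 2, j - 1], q[i - 1, j - 2])
            q[i, j] = prev + 1.0 if crp[i, j] else max(prev - penalty, 0.0)
    return q.max()

# Toy usage: a signal versus a time-warped, noisy rendition of itself.
t = np.linspace(0, 8 * np.pi, 400)
a = np.sin(t)
b = np.sin(1.1 * t) + 0.05 * np.random.default_rng(0).standard_normal(400)
print(trace_score(cross_recurrence_plot(a, b)))  # higher score = stronger match
```

A score of this kind can be computed for every pair of songs in a collection; a query's covers should then rank near the top of its sorted score list.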


Introduction

An unprecedented growth in the availability of and access to digital information is taking place in today's society, and music is a paradigmatic example. Online digital music collections are on the order of millions of tracks, and personal collections can exceed the practical limits on the time to listen to them [1]. This huge amount of information readily accessible to end users poses major challenges for automatically describing, understanding, searching, retrieving, and organizing musical content. Music information retrieval (MIR) is the interdisciplinary research field that deals with these challenges [2]. In content-based MIR, much effort is focused on extracting information from the raw audio signal to represent certain musical aspects such as timbre, melody, main tonality, chords, or tempo [1]. These features are computed in a short-time moving window from a temporal, spectral, or cepstral representation of the audio signal [1], leading to a descriptor time series that reflects the temporal evolution of a given musical aspect. While common MIR strategies characterize these time series by means of statistical modeling or machine learning techniques [3, 4, 5], raw descriptor time series are used for many tasks such as audio alignment and matching [6], song structure analysis [7], music similarity [8], audio fingerprinting [9], or cover song identification [10, 11, 12, 13, 14, 15, 16, 17, 18]. A sketch of such a frame-wise descriptor computation follows.
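As one hedged example of a frame-wise descriptor, the sketch below computes a spectral centroid time series from a raw audio array using a short-time moving window over a spectral (STFT) representation. The window and hop sizes are illustrative assumptions, and cover song systems typically rely on tonal descriptors such as chroma features rather than the centroid; the centroid is used here only because it is compact and self-contained.

```python
import numpy as np

def spectral_centroid_series(audio, sr, win=2048, hop=512):
    """Frame-wise spectral centroid: a descriptor time series computed in a
    short-time moving window over a spectral representation of the signal."""
    window = np.hanning(win)
    freqs = np.fft.rfftfreq(win, d=1.0 / sr)
    n_frames = 1 + (len(audio) - win) // hop
    centroid = np.empty(n_frames)
    for k in range(n_frames):
        frame = audio[k * hop : k * hop + win] * window
        mag = np.abs(np.fft.rfft(frame))
        centroid[k] = (freqs * mag).sum() / (mag.sum() + 1e-12)  # Hz
    return centroid

# Toy usage: a rising chirp yields a rising descriptor curve.
sr = 22050
t = np.arange(sr * 2) / sr
chirp = np.sin(2 * np.pi * (220 + 300 * t) * t)
series = spectral_centroid_series(chirp, sr)
print(series[:3], series[-3:])  # centroid increases over time
```

Time series of this form (one value, or one vector, per analysis frame) are the inputs to the state space embedding and CRP construction described in the abstract.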
