Abstract

In this paper, we discuss a multiresolution method for analyzing music performance in detail, which simultaneously estimates pitch, precise onset time, duration, and intensity from polyphonic audio. The motivation is to obtain information that is detailed enough to develop a performance model of a human player. Characteristics of human performance can be observed as local and global tempo changes, sound intensity (volume, or velocity in MIDI), and articulations such as slur and staccato. Detailed estimation and extraction of such features from a musical audio signal is useful for music information retrieval systems, automatic transcription systems, and automatic performance systems that learn the relationship between music features and player performance. Our proposed system is based on non-negative matrix factorization (NMF) using hierarchical Bayesian inference, which stochastically models harmonic and nonharmonic structures, note durations, intensities, and onset information. The estimation process comprises two steps. In the first step, variational Bayesian inference and a Gaussian mixture model are used to roughly estimate pitch, onset, intensity, and duration. These values are used as a prior for the second, more detailed step, in which the time resolution is doubled and the estimation is repeated to refine the results. The evaluation results show that our proposed multiresolution Bayesian model estimates onset times and durations more precisely than our non-multiresolution Bayesian model.
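The two-step, coarse-to-fine idea can be illustrated with a minimal sketch. It is not the hierarchical Bayesian model described above: it assumes plain NMF with Lee-Seung multiplicative updates for the squared Euclidean error, a synthetic spectrogram, and a simple warm start in place of a prior. All names (`nmf`, `V_coarse`, `V_fine`, `H_init`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def nmf(V, W, H, n_iter=200, eps=1e-9):
    """Plain multiplicative-update NMF for V ~ W @ H (squared Euclidean error)."""
    W, H = W.copy(), H.copy()  # leave the caller's arrays untouched
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

rng = np.random.default_rng(0)
n_bins, n_notes, n_fine = 40, 3, 32

# Synthetic fine-resolution "spectrogram": three notes with known activations.
W_true = rng.random((n_bins, n_notes))
H_true = np.zeros((n_notes, n_fine))
H_true[0, 3:12] = 1.0
H_true[1, 9:20] = 0.7
H_true[2, 21:30] = 0.5
V_fine = W_true @ H_true

# Coarse view: average adjacent frame pairs, halving the time resolution.
V_coarse = 0.5 * (V_fine[:, 0::2] + V_fine[:, 1::2])

# Step 1: rough estimate at the coarse resolution from a random start.
W0 = rng.random((n_bins, n_notes)) + 0.1
H0 = rng.random((n_notes, n_fine // 2)) + 0.1
W_c, H_c = nmf(V_coarse, W0, H0)

# Step 2: double the time resolution, using the coarse result as the start.
# The small floor lets multiplicative updates move entries that were near zero.
H_init = np.repeat(H_c, 2, axis=1) + 1e-3
err_init = np.linalg.norm(V_fine - W_c @ H_init)
W_f, H_f = nmf(V_fine, W_c, H_init)
err_fine = np.linalg.norm(V_fine - W_f @ H_f)

print(f"error before refinement: {err_init:.4f}")
print(f"error after refinement:  {err_fine:.4f}")
```

In this sketch the coarse activations only initialize the fine-resolution factorization. In a Bayesian formulation such as the one the abstract describes, they would instead parameterize prior distributions over the fine-resolution variables.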

