Abstract

This paper proposes a system for score-informed audio source separation of multichannel orchestral recordings. The orchestral music repertoire relies on the existence of scores, so a reliable separation requires a good alignment of the score with the audio of the performance. To that end, automatic score alignment methods are reliable when a tolerance window is allowed around the actual onset and offset. Several factors increase the difficulty of our task: a highly reverberant image, large ensembles with rich polyphony, and a large variety of instruments recorded in a distant-microphone setup. To address these problems, we design context-specific methods, such as refining the score-following output to obtain a more precise alignment. We also extend a close-microphone separation framework to handle distant-microphone orchestral recordings. We then propose the first open evaluation dataset in this musical context, including annotations of the notes played by multiple instruments of an orchestral ensemble. The evaluation analyzes how the main parts of the separation framework interact and how they affect the quality of separation. Results show that we are able to align the original score with the audio of the performance and to separate the sources corresponding to the instrument sections.

Highlights

  • Western classical music is a centuries-old heritage traditionally driven by well-established practices

  • We are interested in establishing a methodology for this task for future research and we propose a dataset in order to objectively assess the contribution of each part of the separation framework to the quality of separation

  • Figure 7 shows the initialization of the framework for the four test cases listed above (Ali, extending the boundaries of the notes (Ext), Ref1, and Ref2), along with the ground truth score initialization (GT), and presents the source separation results for these cases

Summary

Introduction

Western classical music is a centuries-old heritage traditionally driven by well-established practices. The resulting factorized spectrogram is calculated as a linear combination of the template vectors with a set of weight vectors forming the activation matrix. This representation allows for parametric models such as the source-filter model [3, 18, 20] or the multiexcitation model [9], which can capture important traits of harmonic instruments and help separate them, as is the case with orchestral music. The multiexcitation model has been evaluated in a restricted scenario of Bach chorales played by a quartet [4] and, for this particular database, has been extended to close-microphone recordings [6] and to score-informed source separation [11].
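
The factorization described above, a spectrogram approximated as template vectors weighted by an activation matrix, can be illustrated with a plain non-negative matrix factorization (NMF). The sketch below is a minimal, generic example and not the parametric source-filter or multiexcitation model used in the paper. The function name `nmf`, its parameters, and the choice of Lee-Seung multiplicative updates with a Euclidean cost are assumptions made here for illustration only.

```python
import numpy as np


def nmf(V, n_components, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V (frequency x time) as W @ H.

    W holds one spectral template per column; H (the activation matrix)
    holds one row of time-varying weights per template.
    Assumption: Euclidean cost with multiplicative updates, chosen only
    to illustrate the idea of a factorized spectrogram.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape

    # Random non-negative initialization (hypothetical choice; a
    # score-informed system would initialize from the aligned score).
    W = rng.random((n_freq, n_components)) + eps
    H = rng.random((n_components, n_frames)) + eps

    for _ in range(n_iter):
        # Update activations with templates fixed.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # Update templates with activations fixed.
        W *= (V @ H.T) / (W @ H @ H.T + eps)

    return W, H


# Toy "spectrogram": two fixed templates with random activations.
toy_rng = np.random.default_rng(1)
toy_V = toy_rng.random((6, 2)) @ toy_rng.random((2, 8))

toy_W, toy_H = nmf(toy_V, n_components=2)
toy_V_hat = toy_W @ toy_H  # linear combination of templates and gains
```

Each column of `toy_V_hat` is a weighted sum of the columns of `toy_W`, with the weights taken from the matching column of `toy_H`. Because the multiplicative updates keep zero entries at zero, setting activations to zero outside the time regions allowed by a score is one common way to inject score information into this kind of model.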

Proposed Approach Overview
Baseline Method for Multichannel Source Separation
Gains Initialization with Score Information
PARAFAC Model for Multichannel Gains Estimation
Materials and Evaluation
Evaluation Methodology
Results
Methods
Applications
Outlook