Abstract

The task of real-time alignment between a music performance and the corresponding score (sheet music), also known as score following, poses a challenging multi-modal machine learning problem. Training a system that can solve this task robustly with live audio and real sheet music (i.e., scans or score images) requires precise ground truth alignments between audio and note-coordinate positions in the score sheet images. However, these kinds of annotations are difficult and costly to obtain, which is why research in this area mainly utilizes synthetic audio and sheet images to train and evaluate score following systems. In this work, we propose a method that does not solely rely on note alignments but is additionally capable of leveraging data with annotations of lower granularity, such as bar or score system alignments. This allows us to use a large collection of real-world piano performance recordings coarsely aligned to scanned score sheet images and, as a consequence, improve over current state-of-the-art approaches.

Highlights

  • Score following or real-time audio-to-score alignment aims at synchronizing musical performances to the corresponding scores in an on-line fashion

  • Approaches to score following fall mainly into two categories: methods that require symbolic, computer-readable score representations (e.g., Dynamic Time Warping (DTW) or Hidden Markov Models) and methods that work directly with images of scores using deep learning techniques

  • In Section III we investigate generalization in the image domain by considering scanned sheet images and synthetic audio rendered from the score MIDI
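The classical symbolic-score approach named above (DTW) can be illustrated with a minimal sketch. The feature vectors and toy sequences below are hypothetical placeholders; a real system would extract features such as chroma vectors from audio frames and score events, and an on-line score follower would use an incremental DTW variant rather than this offline formulation.

```python
# Minimal offline DTW sketch for audio-to-score alignment.
# The "features" here are hypothetical toy vectors, not real audio features.

def cost(a, b):
    # Euclidean distance between two feature vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def dtw_align(performance, score):
    """Align two feature sequences; return the warping path and total cost."""
    n, m = len(performance), len(score)
    INF = float("inf")
    # D[i][j] = accumulated cost of aligning the first i performance frames
    # with the first j score events.
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = cost(performance[i - 1], score[j - 1])
            D[i][j] = c + min(D[i - 1][j],      # performance frame repeated
                              D[i][j - 1],      # score event skipped
                              D[i - 1][j - 1])  # one-to-one match
    # Backtrack from (n, m) to recover the warping path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((D[i - 1][j - 1], i - 1, j - 1),
                      (D[i - 1][j], i - 1, j),
                      (D[i][j - 1], i, j - 1))
    return list(reversed(path)), D[n][m]

# Toy sequences: the "performance" dwells on the second score event,
# as a player might hold a note longer than notated.
score = [[1, 0], [0, 1], [1, 1]]
performance = [[1, 0], [0, 1], [0, 1], [1, 1]]
path, total = dtw_align(performance, score)
print(path)   # each pair maps a performance frame to a score event
print(total)  # 0.0: every frame matches some score event exactly
```

The path maps the repeated performance frame back to the same score event, which is exactly the kind of tempo deviation an alignment method must absorb; the image-based deep learning approaches discussed in the paper avoid the need for the symbolic score representation this sketch assumes.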


Summary

INTRODUCTION

Score following, or real-time audio-to-score alignment, aims at synchronizing musical performances (audio) to the corresponding scores (the printed sheet music from which the musicians are presumably playing) in an on-line fashion. In addition to the intrinsic difficulty of this task, with different ways in which the same musical passage can be typeset and played, we face a severe data problem: training such a network requires large amounts of fine-grained annotations between note positions on the sheet image and in the audio. Obtaining such information at this level of precision via manual annotation is practically impossible, at least in the acoustic domain. We conduct large-scale experiments to investigate the generalization capabilities of our proposed system in the audio as well as in the sheet-image domain.

RELATED WORK
MULTI-MODAL BOUNDING BOX PREDICTION FOR SCORE FOLLOWING
Conditional YOLO
Query Encoding
A NEW APPROACH
EXPERIMENT SETUP
Models
Dedicated Datasets
Training Details
Evaluation Metrics
Model Comparison
Error Analysis
CONCLUSION
DATA AVAILABILITY STATEMENT

