Deep Learning-Based Optical Music Recognition for Semantic Representation of Non-overlap and Overlap Music Notes

Rana L. Abdulazeez, Fattah Alizadeh

doi:10.14500/aro.11402

Abstract

Optical music recognition (OMR) is the process of teaching a computer to interpret musical notation. It aims to convert an image of a music sheet into a computer-readable format. Recently, the sequence-to-sequence model with an attention mechanism, which is already used in text and handwriting recognition, has been applied to music note recognition. However, because information fades over excessively long music-sheet sequences, the existing OMR models built on long short-term memory (LSTM) have difficulty learning the relationships among the musical notations. Consequently, a new framework is proposed that leverages image segmentation to break the procedure into several steps. This study also addresses the overlap problem in OMR: overlapping can cause music notations to be misinterpreted, producing inaccurate results. A novel algorithm is therefore proposed to detect and segment notations that are extremely close to each other. Our experiments use a convolutional neural network (CNN) block as a feature extractor for the music sheet image and a sequence-to-sequence model to retrieve the corresponding semantic representation. The proposed approach is evaluated on the Printed Images of Music Staves (PrIMuS) dataset. The results confirm that the proposed framework successfully solves the problem of long-sequence music sheets, obtaining a symbol error rate (SER) of 0% for non-overlapping symbols in the best scenario. Furthermore, the approach shows promising results on the overlapping problem: 23.12% SER for overlapping symbols.
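
The abstract reports its results as SER without defining the metric. As a minimal sketch, assuming SER is computed in the way common in OMR evaluation (the edit distance between predicted and reference symbol sequences, divided by the total number of reference symbols), it could be calculated as follows. The function names and the example tokens are illustrative assumptions and are not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two symbol sequences."""
    # prev[j] holds the distance between the first i-1 reference
    # symbols and the first j hypothesis symbols.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def symbol_error_rate(references, hypotheses):
    """SER in percent over a set of staves (assumed definition)."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have equal length")
    total_symbols = sum(len(r) for r in references)
    if total_symbols == 0:
        raise ValueError("reference set contains no symbols")
    total_edits = sum(edit_distance(r, h)
                      for r, h in zip(references, hypotheses))
    return 100.0 * total_edits / total_symbols


# Hypothetical semantic tokens for one staff (made up for illustration).
reference = ["clef-G2", "timeSignature-2/4", "note-C4_quarter", "barline"]
perfect = list(reference)
one_wrong = ["clef-G2", "timeSignature-2/4", "note-D4_quarter", "barline"]

ser_perfect = symbol_error_rate([reference], [perfect])
ser_one_wrong = symbol_error_rate([reference], [one_wrong])
```

Under this assumed definition, a prediction identical to the reference gives an SER of 0%, and one substituted symbol out of four gives 25%.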
