Abstract

The development of music culture has resulted in a problem called optical music recognition (OMR). OMR is a task in computer vision that explores the algorithms and models to recognize musical notation. This study proposed the stacking ensemble learning model to complete the OMR task using the common western musical notation (CWMN) musical notation. The ensemble learning model used four deep convolutional neural networks (DCNNs) models, namely ResNeXt50, Inception-V3, RegNetY-400MF, and EfficientNet-V2-S as the base classifier. This study also analysed the most appropriate technique to be used as the ensemble learning model’s meta-classifier. Therefore, several machine learning techniques are determined to be evaluated, namely support vector machine (SVM), logistic regression (LR), random forest (RF), K-nearest neighbor (KNN), decision tree (DT), and Naïve Bayes (NB). Six publicly available OMR datasets are combined, down sampled, and used to test the proposed model. The dataset consists of the HOMUS_V2, Rebelo1, Rebelo2, Fornes, OpenOMR, and PrintedMusicSymbols datasets. The proposed ensemble learning model managed to outperform the model built in the previous study and succeeded in achieving outstanding accuracy and F1-scores with the best value of 97.51% and 97.52%, respectively; both of which were achieved by the LR meta-classifier.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.