Abstract

Automatic music transcription (AMT) is the conversion of audio into music notation, which supports music education, music production, and music creation. The Thai xylophone is a Thai classical music instrument that is commonly played with one of two types of mallets: soft or hard. This paper studies AMT for the Thai xylophone played with soft mallets. We compared two feature extraction methods, the Mel-spectrogram and Mel-frequency cepstral coefficients (MFCC), as input to the Onsets and Frames (OaF) deep learning model, which is the state of the art for AMT. As the dataset, we collected 30 songs played on the Thai xylophone with soft mallets, together with their music notation. The results show that the Mel-spectrogram outperforms MFCC: the Mel-spectrogram with the OaF model performed best, with an F1-score of 87.04% on the frame detector and 94.35% on the onset detector. We also conducted an ablation study.
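
The frame-level F1-score reported above can be illustrated with a minimal sketch. It assumes the reference and estimated transcriptions are binary piano rolls (one row per time frame, one column per pitch); the function name `frame_f1` and the toy data are hypothetical, and the exact evaluation protocol used in the paper is not stated in this abstract.

```python
def frame_f1(reference, estimate):
    """Return (precision, recall, f1) for two binary piano rolls.

    Each roll is a list of frames; each frame is a list of 0/1 flags,
    one per pitch. This layout is an assumption for illustration only.
    """
    tp = fp = fn = 0
    for ref_frame, est_frame in zip(reference, estimate):
        for ref_active, est_active in zip(ref_frame, est_frame):
            if ref_active and est_active:
                tp += 1  # pitch correctly detected in this frame
            elif est_active:
                fp += 1  # pitch detected but not present in the reference
            elif ref_active:
                fn += 1  # pitch present in the reference but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example: 4 frames, 3 pitches (made-up data, not from the paper).
reference = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 0]]
estimate = [[1, 0, 0], [0, 0, 0], [0, 1, 1], [0, 0, 0]]
precision, recall, f1 = frame_f1(reference, estimate)
```

The F1-score is the harmonic mean of precision and recall, so it penalizes a detector that gains recall only by predicting many spurious notes. An onset-level score would compare note start times within a tolerance window instead of comparing individual frames.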
