Deep Convolution Neural Network for Thai Classical Music Instruments Sound Recognition

Sunee Pongpinigpinyo, Apichai Huaysrijan

doi:10.1109/icsec53205.2021.9684611

Abstract

Identifying Thai classical music instruments from recorded sounds is a challenging research problem in Thai music analysis, with applications such as Thai music retrieval and automatic transcription of Thai music. Feature extraction and recognition are the two fundamental steps in sound recognition. This study compares Mel-Frequency Cepstral Coefficient (MFCC) and Mel-Spectrogram feature extraction, using a deep learning model based on a convolutional neural network (CNN) for Thai classical music instrument sound recognition. The dataset was collected by recording the sounds of 13 Thai classical music instruments. The results show that MFCC outperforms Mel-Spectrogram and that the CNN model outperforms other methods such as Long Short-Term Memory (LSTM) and Convolutional Recurrent Neural Network (CRNN). MFCC with CNN performed best in the experiments, with an accuracy of 99.44 percent and an F1-score of 0.99.
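The two feature types compared in the abstract are closely related: in the usual formulation, MFCCs are obtained by applying a discrete cosine transform (DCT-II) to the log-scaled mel-band energies of each frame, so a mel-spectrogram frame is the input and an MFCC vector is a decorrelated, truncated summary of it. The following is a minimal Python sketch of that relationship only. The function names, the HTK-style mel formula, the unnormalized DCT, and the toy frame are illustrative assumptions and are not taken from the paper, which does not state its extraction settings in the abstract.

```python
import math


def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels (HTK-style formula, assumed here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def dct_ii(x, n_out):
    """Unnormalized DCT-II of x, keeping the first n_out coefficients."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * k * (i + 0.5) / n) for i in range(n))
        for k in range(n_out)
    ]


def mfcc_from_log_mel(log_mel_frame, n_coeffs):
    """Hypothetical helper: MFCCs of one frame from its log mel-band energies."""
    return dct_ii(log_mel_frame, n_coeffs)


# Toy frame (made-up values): 8 mel bands with identical log energy.
flat_frame = [1.0] * 8
coeffs = mfcc_from_log_mel(flat_frame, 4)
```

For a spectrally flat frame like this one, all of the energy lands in the zeroth coefficient and the higher coefficients vanish up to rounding error, which illustrates how the DCT compacts the mel-spectrogram's information into a few values.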

Full Text