Abstract

A drum kit is a polyphonic instrument composed of several instruments, such as the snare, cymbal, tom, and hi-hat. Each constituent instrument has a different sound (pitch, tone), and its sound varies with the playing technique, tuning, drum manufacturer, and whether the drum is acoustic or electronic. In addition, the frequency ranges of the constituent instruments overlap. For this reason, and to improve on the drum sound recognition rates of previous studies, which were not high, this paper attempts to discriminate between the sounds of electronic and acoustic drums, a task that had not previously been studied. The IDMT-SMT-Drums data set (https://www.idmt.fraunhofer.de/en/business_units/m2d/smt/drums.html), which has already been used in several previous studies, served as the training and test data: 70% of the drum sound WAV files were used for training and 30% for testing. A Convolutional Neural Network (CNN) was used for the analysis, and the drum sound WAV files were converted into spectrogram image files using Mel spectrogram processing to serve as CNN input. The resulting recognition rate ranged from 78% to 92% depending on the hyper-parameter settings, and it was found that the pixel size of the input spectrogram and the number of CNN filters affect the recognition rate. Future research should secure additional training and test data and study further improvements in the recognition rate by applying various hyper-parameters.
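The pipeline described above (WAV file to Mel spectrogram image to CNN classifier) can be sketched as follows. This is a minimal illustration only: the paper does not name its toolchain, layer counts, filter sizes, or spectrogram dimensions, so the use of librosa and TensorFlow/Keras, the 128-band/128-frame input size, and the layer configuration are all assumptions made for the example.

```python
# Minimal sketch of the WAV -> Mel spectrogram -> CNN pipeline described above.
# Library choices (librosa, TensorFlow/Keras) and all sizes are assumptions;
# the paper does not specify its toolchain or exact hyper-parameters.
import numpy as np
import librosa
from tensorflow.keras import layers, models

IMG_SIZE = 128   # assumed number of time frames per spectrogram image
N_MELS = 128     # assumed number of Mel frequency bands

def wav_to_mel_image(path, sr=22050):
    """Load a drum WAV file and return a fixed-size, normalized Mel spectrogram."""
    y, sr = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=N_MELS)
    mel_db = librosa.power_to_db(mel, ref=np.max)
    # Pad or crop the time axis so every input has the same shape.
    if mel_db.shape[1] < IMG_SIZE:
        mel_db = np.pad(mel_db, ((0, 0), (0, IMG_SIZE - mel_db.shape[1])))
    else:
        mel_db = mel_db[:, :IMG_SIZE]
    # Normalize to [0, 1] and add a channel axis for the CNN.
    mel_db = (mel_db - mel_db.min()) / (mel_db.max() - mel_db.min() + 1e-8)
    return mel_db[..., np.newaxis]

def build_cnn(num_filters=32):
    """Small CNN for binary classification (acoustic vs. electronic drum sound)."""
    return models.Sequential([
        layers.Input(shape=(N_MELS, IMG_SIZE, 1)),
        layers.Conv2D(num_filters, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(num_filters * 2, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])

# Hypothetical usage with a list of WAV paths and labels (0 = acoustic, 1 = electronic):
# X = np.stack([wav_to_mel_image(p) for p in wav_paths])
# model = build_cnn()
# model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(X, y, validation_split=0.3, epochs=20)  # approximates the paper's 70/30 split
```

In a sketch like this, the pixel size of the input spectrogram (IMG_SIZE, N_MELS) and the number of CNN filters (num_filters) are the kinds of hyper-parameters the abstract reports as affecting the recognition rate.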
