Sound Classification Using Residual Convolutional Network

Mahesh Jangid, Kabir Nagpal

doi:10.1007/978-981-16-2641-8_23

Abstract

In this paper, we propose a new architecture for environmental sound classification on the ESC-50 and urban sound datasets. The ESC-50 dataset is a collection of 2000 labeled environmental audio recordings, and the urban sound dataset is a collection of 8732 labeled sound records. Mel-frequency cepstral analysis is used to obtain a power-spectrum representation of each sound wave. The resulting matrix makes it possible to apply a convolutional neural network architecture to the dataset. The new architecture repeatedly extracts increasingly complex features and, by using a ResNet-type design, carries them through a greater network depth. After fine-tuning the network, we achieved 89.5% validation accuracy on the environmental sound classification (ESC-50) dataset and 96.76% on the urban sound dataset.

Keywords: Convolutional neural network; Residual neural network; Environmental sound classification; Mel frequency cepstral
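
To illustrate what a ResNet-type design means for a feature matrix such as the one obtained from the audio, the following is a minimal sketch of a single residual block in plain Python. It is not the authors' network: the abstract does not give layer counts, kernel sizes, or weights, so the function names (conv2d_same, residual_block), the 3x3 kernels, and the tiny input matrix are all assumptions chosen for illustration only.

```python
def conv2d_same(x, kernel):
    """2D cross-correlation with zero padding; output has the same shape as x.

    Assumes an odd-sized kernel (for example 3x3), an illustrative choice.
    """
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    rows, cols = len(x), len(x[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    r, c = i + a - ph, j + b - pw
                    if 0 <= r < rows and 0 <= c < cols:  # zero padding outside
                        acc += kernel[a][b] * x[r][c]
            out[i][j] = acc
    return out


def relu(x):
    """Element-wise rectified linear unit."""
    return [[max(0.0, v) for v in row] for row in x]


def residual_block(x, k1, k2):
    """Compute relu(F(x) + x), where F is conv -> relu -> conv.

    The "+ x" term is the skip connection: the input is carried past the
    convolutions unchanged, which is what lets features and gradients
    survive through a deep stack of such blocks.
    """
    h = relu(conv2d_same(x, k1))
    h = conv2d_same(h, k2)
    summed = [[hv + xv for hv, xv in zip(hr, xr)] for hr, xr in zip(h, x)]
    return relu(summed)


# Hypothetical 2x3 feature matrix standing in for a spectral representation.
features = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
zero_kernel = [[0.0] * 3 for _ in range(3)]
identity_kernel = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]

# With all-zero kernels the block reduces to the skip path alone.
skip_only = residual_block(features, zero_kernel, zero_kernel)
# With identity kernels F(x) = x, so the block returns relu(x + x).
doubled = residual_block(features, identity_kernel, identity_kernel)
```

With all-zero kernels the block returns its non-negative input unchanged, which shows the property the abstract relies on: even when the convolutional path contributes nothing, the skip connection still passes the features to the next block.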

Full Text