Abstract

In this paper, we propose a new architecture for environmental sound classification on the ESC-50 and urban sound datasets. The ESC-50 dataset is a collection of 2000 labeled environmental audio recordings, and the urban sound dataset is a collection of 8732 labeled sound recordings. The Mel-frequency cepstrum is used to represent the power spectrum of each sound wave. The resulting matrix makes it possible to apply a convolutional neural network architecture to the dataset. The new architecture repeatedly extracts more complex features and carries them through a greater depth by using a ResNet-type architecture. After fine-tuning the network, we achieve 89.5% validation accuracy on the ESC-50 environmental sound classification dataset and 96.76% on the urban sound dataset.

Keywords: Convolutional neural network; Residual neural network; Environmental sound classification; Mel frequency cepstral
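
To illustrate how a one-dimensional recording becomes a two-dimensional matrix that a convolutional network can consume, the following is a minimal sketch rather than the authors' pipeline. It uses the common HTK-style mel formula and a centered-framing frame count. The function names and all parameter values (44.1 kHz sample rate, 5-second clip, hop length of 512 samples, 40 coefficients, 40 mel bands) are assumptions chosen for illustration and are not taken from the paper.

```python
import math


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to mels (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(n_bands: int, f_min: float, f_max: float) -> list:
    """Return the n_bands + 2 edge frequencies (Hz) of triangular filters
    spaced equally on the mel scale between f_min and f_max."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]


def feature_matrix_shape(n_samples: int, hop_length: int, n_coeffs: int) -> tuple:
    """Shape (coefficients, frames) of the cepstral matrix, assuming the
    signal is padded so that each frame is centered on a hop position."""
    n_frames = 1 + n_samples // hop_length
    return (n_coeffs, n_frames)


# Hypothetical clip: 5 seconds at 44.1 kHz (assumed values, not from the paper).
sample_rate = 44100
n_samples = 5 * sample_rate

edges = mel_band_edges(n_bands=40, f_min=0.0, f_max=sample_rate / 2)
shape = feature_matrix_shape(n_samples, hop_length=512, n_coeffs=40)
print(shape)  # (40, 431)
```

Under these assumptions each clip yields a fixed-size 40 x 431 matrix, which can be treated like a single-channel image by the convolutional layers. The mel spacing places filter edges densely at low frequencies and sparsely at high frequencies.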
