Abstract

Acoustic scene classification is an active research field in both the audio signal processing and machine learning communities. Because of uncontrolled recording environments and the wide diversity of environmental sounds, classifying acoustic scene recordings automatically is a challenging task. In this study, we analyze the performance of deep learning algorithms on the acoustic scene classification problem, in which sound events carry continuous temporal information. To this end, we examine AlexNet- and VGGish-based 4- and 8-layered convolutional neural networks (CNNs) combined with long short-term memory recurrent neural network (LSTM-RNN) and gated recurrent unit recurrent neural network (GRU-RNN) architectures, adapting the LSTM-RNN and GRU-RNN models to the 4- and 8-layered CNN architectures for this classification task. Our experimental results show that the 4-layered CNN with the GRU structure improves classification accuracy.
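
Below is a minimal PyTorch sketch of the kind of 4-layer CNN followed by a GRU that the abstract describes. All hyperparameters (channel counts, kernel sizes, pooling shape, GRU hidden size, number of scene classes, spectrogram dimensions) are illustrative assumptions, not the authors' exact configuration:

```python
import torch
import torch.nn as nn

class CNNGRUClassifier(nn.Module):
    """Sketch of a 4-layer CNN + GRU acoustic scene classifier.
    Hyperparameters are assumed for illustration only."""

    def __init__(self, n_mels=64, n_classes=10, hidden=128):
        super().__init__()
        # Four convolutional blocks over (batch, 1, mels, frames) spectrograms.
        chans = [1, 32, 64, 128, 128]
        blocks = []
        for c_in, c_out in zip(chans[:-1], chans[1:]):
            blocks += [
                nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
                nn.BatchNorm2d(c_out),
                nn.ReLU(),
                nn.MaxPool2d((2, 1)),  # pool frequency, keep time resolution
            ]
        self.cnn = nn.Sequential(*blocks)
        freq_out = n_mels // 16  # four (2, 1) pools halve the mel axis 4 times
        self.gru = nn.GRU(128 * freq_out, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):                     # x: (batch, 1, n_mels, n_frames)
        h = self.cnn(x)                       # (batch, 128, n_mels//16, n_frames)
        h = h.permute(0, 3, 1, 2).flatten(2)  # (batch, n_frames, features)
        _, last = self.gru(h)                 # final hidden state summarizes time
        return self.fc(last.squeeze(0))       # (batch, n_classes)

model = CNNGRUClassifier()
logits = model(torch.randn(8, 1, 64, 431))  # e.g. ~10 s clips at 64 mel bands
```

The convolutional front end extracts local time-frequency features, while the GRU models the continuous temporal information across frames, the combination the abstract reports as most accurate.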
