Domain Adaptation Neural Network for Acoustic Scene Classification in Mismatched Conditions

Rui Wang, Xiao-Lei Zhang, Mou Wang, Susanto Rahardja

doi:10.1109/apsipaasc47483.2019.9023057

Abstract

Acoustic scene classification (ASC) is the task of predicting the acoustic environment of an audio recording. Because the training and test conditions in most real-world ASC problems do not match, domain adaptation methods are needed to solve the cross-domain problem. In this paper, we propose an ASC method based on a domain adaptation neural network (DANN). Specifically, we first extract an acoustic feature, the log-Mel spectrogram, which has proven effective in previous studies. Then, we train a DANN to project the training and test domains into one common space, where the acoustic scenes are categorized jointly. To boost the overall performance of the proposed method, we further train an ensemble of convolutional neural network (CNN) models with different parameter settings. Finally, we fuse the DANN and CNN models by averaging their outputs. We evaluate the proposed method on subtask B of task 1 of the DCASE 2019 ASC challenge, a closed-set classification problem whose audio recordings were made with mismatched devices. Experimental results demonstrate the effectiveness of the proposed method for acoustic scene classification in mismatched conditions.
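The fusion step described above, averaging the outputs of the DANN and CNN models, can be illustrated with a minimal sketch. It assumes that each model emits one score per scene class and that all models are weighted equally; the abstract does not say whether posteriors or logits are averaged, and the function names, scene labels, and score values below are hypothetical.

```python
from typing import List, Sequence


def fuse_by_averaging(model_outputs: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class scores across models with equal weights (an assumption)."""
    if not model_outputs:
        raise ValueError("need at least one model output")
    n_classes = len(model_outputs[0])
    if any(len(scores) != n_classes for scores in model_outputs):
        raise ValueError("all models must score the same set of classes")
    n_models = len(model_outputs)
    return [sum(scores[c] for scores in model_outputs) / n_models
            for c in range(n_classes)]


def predict_scene(model_outputs: Sequence[Sequence[float]],
                  labels: Sequence[str]) -> str:
    """Return the label whose fused score is highest."""
    fused = fuse_by_averaging(model_outputs)
    best = max(range(len(fused)), key=fused.__getitem__)
    return labels[best]


# Hypothetical scene labels and per-model class scores for one recording.
labels = ["airport", "park", "metro"]
dann_out = [0.2, 0.5, 0.3]   # illustrative DANN output
cnn_a_out = [0.1, 0.2, 0.7]  # illustrative CNN output, setting A
cnn_b_out = [0.3, 0.4, 0.3]  # illustrative CNN output, setting B

fused = fuse_by_averaging([dann_out, cnn_a_out, cnn_b_out])
prediction = predict_scene([dann_out, cnn_a_out, cnn_b_out], labels)
print(fused, prediction)
```

In this toy case the fused decision differs from the decision of the first model alone, which is the kind of effect an ensemble is meant to exploit; whether it helps in practice depends on how complementary the models are.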

Full Text