Classifier Architectures for Acoustic Scenes and Events: Implications for DNNs, TDNNs, and Perceptual Features from DCASE 2016

Jens Schroder, Niko Moritz, Jorn Anemuller, Stefan Goetze, Birger Kollmeier

doi:10.1109/taslp.2017.2690569

Abstract

This paper evaluates neural network (NN) based systems and compares them to Gaussian mixture model (GMM) and hidden Markov model (HMM) approaches for acoustic scene classification (SC) and polyphonic acoustic event detection (AED), applied to data from task 1 and task 3, respectively, of the “Detection and Classification of Acoustic Scenes and Events 2016” (DCASE'16) challenge. For both tasks, deep neural networks (DNNs) and features based on an amplitude modulation filterbank and a Gabor filterbank (GFB) are evaluated and compared to standard approaches. For SC, a time-delay NN approach is additionally proposed that enables analysis of long contextual information, similar to recurrent NNs, but with training effort comparable to conventional DNNs. The SC system proposed for task 1 of the DCASE'16 challenge attains a recognition accuracy of 77.5%, which is 5.6% higher than the DCASE'16 baseline system. For the AED task, DNNs are adopted in tandem and hybrid approaches, i.e., as part of HMM-based systems. These systems are evaluated on the polyphonic data of task 3 of the DCASE'16 challenge. Several strategies to address the issue of polyphony are considered. It is shown that DNN-based systems perform less accurately than the traditional systems for this task. The best results are achieved using GFB features in combination with a multiclass GMM-HMM back end.
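As a rough illustration of why a time-delay NN can cover long context at a training cost close to that of a feed-forward DNN, the following sketch stacks layers that each splice a few neighbouring frames before an ordinary affine transform and ReLU. It is a minimal sketch of the general time-delay idea, not the architecture evaluated in the paper: the function names (`splice`, `tdnn_layer`, `receptive_field`) and the example offsets are assumptions made for illustration, since the abstract does not specify layer sizes or context widths.

```python
def splice(frames, offsets):
    """Concatenate each frame with its neighbours at the given relative offsets.

    frames:  list of feature vectors (lists of floats), one per time step.
    offsets: relative frame indices to splice, e.g. (-2, 0, 2).
    Only time steps where every offset falls inside the sequence are kept.
    """
    lo, hi = min(offsets), max(offsets)
    spliced = []
    for t in range(max(0, -lo), len(frames) - max(0, hi)):
        context = []
        for o in offsets:
            context.extend(frames[t + o])
        spliced.append(context)
    return spliced


def tdnn_layer(frames, weights, bias, offsets):
    """One time-delay layer: splice context, then affine transform and ReLU.

    weights: one row per output unit, each of length len(offsets) * input_dim.
    bias:    one value per output unit.
    """
    outputs = []
    for context in splice(frames, offsets):
        outputs.append([
            max(0.0, b + sum(w * v for w, v in zip(row, context)))
            for row, b in zip(weights, bias)
        ])
    return outputs


def receptive_field(layer_offsets):
    """Number of input frames seen by one output frame of the stacked layers."""
    return 1 + sum(max(o) - min(o) for o in layer_offsets)


# Hypothetical offsets: a narrow first layer, wider spacing in the second.
EXAMPLE_OFFSETS = [(-1, 0, 1), (-2, 0, 2)]
```

With the hypothetical offsets above, each layer only multiplies a three-frame splice by a weight matrix, yet the temporal span seen by an output frame grows with every layer. Widening the spacing in higher layers extends the context further without adding weights per layer.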


Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Publication Date: Jun 1, 2017
Citations: 21

Similar Papers

Acoustic Scene Classification Using Reduced MobileNet Architecture
Jun-Xiang Xu ... Tzu-Ching Lin
01 Dec 2018

Detection and Classification of Acoustic Scenes and Events: Outcome of the DCASE 2016 Challenge
Annamaria Mesaros ... Emmanouil Benetos
IEEE/ACM Transactions on Audio, Speech, and Language Processing | VOL. 26
28 Nov 2017

Creating a new research community on detection and classification of acoustic scenes and events: Lessons from the first ten years of DCASE challenges and workshops
Mark Plumbley ... Tuomas Virtanen
INTER-NOISE and NOISE-CON Congress and Conference Proceedings | VOL. 265
01 Feb 2023

On the use of spectro-temporal features for the IEEE AASP challenge ‘detection and classification of acoustic scenes and events’
Jens Schroder ... Stefan Goetze
01 Oct 2013
