Abstract
This article addresses the problem of multichannel audio source separation. We propose a framework where deep neural networks (DNNs) are used to model the source spectra and are combined with the classical multichannel Gaussian model to exploit the spatial information. The parameters are estimated in an iterative expectation-maximization (EM) fashion and used to derive a multichannel Wiener filter. We present an extensive experimental study to show the impact of different design choices on the performance of the proposed technique. We consider different cost functions for the training of DNNs, namely the probabilistically motivated Itakura–Saito divergence, as well as Kullback–Leibler, Cauchy, mean squared error, and phase-sensitive cost functions. We also study the number of EM iterations and the use of multiple DNNs, where each DNN aims to improve the spectra estimated by the preceding EM iteration. Finally, we present the application of the framework to a speech enhancement problem. The experimental results show the benefit of the proposed multichannel approach over a single-channel DNN-based approach and the conventional multichannel nonnegative matrix factorization-based iterative EM algorithm.
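To illustrate the filtering step the abstract refers to, the following is a minimal NumPy sketch (not the authors' implementation) of a multichannel Wiener filter built from DNN-estimated source power spectra v_j(f,n) and per-source spatial covariance matrices R_j(f), under the multichannel Gaussian model. The function name, array layouts, and the regularization constant are assumptions made for illustration only.

```python
import numpy as np

def multichannel_wiener_filter(x_stft, v, R):
    """Hypothetical sketch of the multichannel Wiener filtering step.

    x_stft : (F, N, I) complex mixture STFT (F freq bins, N frames, I channels)
    v      : (J, F, N) nonnegative source power spectra (e.g., DNN estimates)
    R      : (J, F, I, I) spatial covariance matrices per source and frequency
    returns: (J, F, N, I) complex multichannel source image estimates
    """
    J, F, N = v.shape
    I = x_stft.shape[-1]
    eps = 1e-10  # assumed regularization to keep the inversion well conditioned

    # Mixture covariance under the Gaussian model: R_x(f,n) = sum_j v_j(f,n) R_j(f)
    Rx = np.einsum('jfn,jfab->fnab', v, R) + eps * np.eye(I)
    Rx_inv = np.linalg.inv(Rx)  # (F, N, I, I)

    s_hat = np.empty((J, F, N, I), dtype=complex)
    for j in range(J):
        # Wiener gain: W_j(f,n) = v_j(f,n) R_j(f) R_x(f,n)^{-1}
        W = v[j, :, :, None, None] * np.einsum('fab,fnbc->fnac', R[j], Rx_inv)
        # Filter the mixture: s_hat_j(f,n) = W_j(f,n) x(f,n)
        s_hat[j] = np.einsum('fnab,fnb->fna', W, x_stft)
    return s_hat
```

In an EM-style loop of the kind described above, the spectra v and spatial covariances R would be re-estimated at each iteration (with v refined by a DNN) before the filter is reapplied; this sketch only shows the separation step itself.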