Blind Separation of Audio Mixtures Through Nonnegative Tensor Factorization of Modulation Spectrograms

Tom Barker, Tuomas Virtanen

doi:10.1109/taslp.2016.2602546

Abstract

This paper presents an algorithm for unsupervised single-channel source separation of audio mixtures. The approach specifically addresses the challenging case of separation where no training data are available. By representing mixtures in the modulation spectrogram (MS) domain, we exploit underlying similarities in patterns present across frequency. A three-dimensional tensor factorization is able to take advantage of these redundant patterns, and is used to separate a mixture into an approximated sum of components by minimizing a divergence cost. Furthermore, we show that the basic tensor factorization can be extended with convolution in time to improve separation results, and we provide update rules for learning components in this way. Following factorization, sources are reconstructed in the audio domain from the estimated components using a novel approach based on reconstruction masks, which are learned using MS activations and then applied to a mixture spectrogram. We demonstrate that the proposed method produces superior separation performance to a spectrally based nonnegative matrix factorization approach, in terms of source-to-distortion ratio. We also compare separation using the perceptually motivated interference-related perceptual score metric and identify cases with higher performance.
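To make the factorization step concrete, the following is a minimal sketch of a three-way nonnegative tensor factorization fitted with the standard multiplicative updates for the generalized Kullback-Leibler divergence. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the exact divergence, the tensor layout, or the update rules, so the axis ordering (channel, modulation frequency, time frame), the factor names `G`, `A`, `S`, and the function `ntf_kl` are all hypothetical. The convolutive extension and the mask-based reconstruction described above are not covered.

```python
import numpy as np

EPS = 1e-12  # small constant to avoid division by zero and log(0)


def reconstruct(G, A, S):
    """Model tensor: Xhat[r, n, m] = sum_k G[r, k] * A[n, k] * S[m, k]."""
    return np.einsum("rk,nk,mk->rnm", G, A, S)


def kl_divergence(X, Xhat):
    """Generalized KL divergence between two nonnegative tensors."""
    return float(np.sum(X * np.log((X + EPS) / (Xhat + EPS)) - X + Xhat))


def ntf_kl(X, n_components, n_iter=100, seed=0):
    """Fit a rank-`n_components` nonnegative CP model to a 3-D tensor X.

    Assumed (hypothetical) layout: axis 0 = frequency channel,
    axis 1 = modulation frequency, axis 2 = time frame.
    Returns the three factor matrices and the divergence after each iteration.
    """
    rng = np.random.default_rng(seed)
    R, N, M = X.shape
    # Strictly positive random initialization keeps the updates well defined.
    G = rng.random((R, n_components)) + 0.1
    A = rng.random((N, n_components)) + 0.1
    S = rng.random((M, n_components)) + 0.1

    history = [kl_divergence(X, reconstruct(G, A, S))]
    for _ in range(n_iter):
        # Each factor is updated in turn, recomputing the ratio tensor Q first.
        Q = X / (reconstruct(G, A, S) + EPS)
        G *= np.einsum("rnm,nk,mk->rk", Q, A, S) / (A.sum(0) * S.sum(0) + EPS)

        Q = X / (reconstruct(G, A, S) + EPS)
        A *= np.einsum("rnm,rk,mk->nk", Q, G, S) / (G.sum(0) * S.sum(0) + EPS)

        Q = X / (reconstruct(G, A, S) + EPS)
        S *= np.einsum("rnm,rk,nk->mk", Q, G, A) / (G.sum(0) * A.sum(0) + EPS)

        history.append(kl_divergence(X, reconstruct(G, A, S)))
    return G, A, S, history
```

Because the updates are purely multiplicative, factors that start nonnegative stay nonnegative, and each component `k` contributes one rank-one term to the approximated sum. In a separation setting, components would then be grouped by source; how that grouping and the audio-domain reconstruction are done is specific to the paper and is not reproduced here.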

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Publication Date: Dec 1, 2016
Citations: 50

Similar Papers

Semi-supervised non-negative tensor factorisation of modulation spectrograms for monaural speech separation
Tom Barker ... Tuomas Virtanen
01 Jul 2014

Non-negative tensor factorisation of modulation spectrograms for monaural sound source separation
Tom Barker ... Tuomas Virtanen
25 Aug 2013

Advances in Nonnegative Matrix and Tensor Factorization
A Cichocki ... M Mørup
Computational Intelligence and Neuroscience | VOL. 2008
01 Jan 2008

Audio signal separation through complex tensor factorization: Utilizing modulation frequency and phase information
Shogo Masaya
Signal Processing | VOL. 142
18 Jul 2017
