Semi-supervised Multichannel Speech Enhancement with Variational Autoencoders and Non-negative Matrix Factorization

Simon Leglaive, Laurent Girin, Radu Horaud

doi:10.1109/icassp.2019.8683704

Open Access

Publication Date: Apr 30, 2019

Citations: 83

Affiliation: French Institute for Research in Computer Science and Automation, Inria Grenoble - Rhône-Alpes research centre, Université Grenoble Alpes

Abstract

In this paper, we address speaker-independent multichannel speech enhancement in unknown noisy environments. Our work is based on a well-established multichannel local Gaussian modeling framework. We propose to use a neural network for modeling the speech spectro-temporal content. The parameters of this supervised model are learned using the framework of variational autoencoders. The noisy recording environment is assumed to be unknown, so the noise spectro-temporal modeling remains unsupervised and is based on non-negative matrix factorization (NMF). We develop a Monte Carlo expectation-maximization algorithm, and we experimentally show that the proposed approach outperforms its NMF-based counterpart, where speech is modeled using supervised NMF.
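The abstract gives no equations, so the following is only a minimal sketch of how the pieces it names could fit together, assuming the usual local Gaussian formulation: each source image in a time-frequency bin is zero-mean complex Gaussian with covariance equal to a scalar variance times a spatial covariance matrix, and speech is recovered with a multichannel Wiener filter. The function name, argument names, and array shapes are hypothetical. The speech variance `v_s` is a random placeholder for what a trained decoder network would output, the noise variance is the NMF product `W_b @ H_b`, and the Monte Carlo expectation-maximization parameter estimation is not shown.

```python
import numpy as np


def multichannel_wiener_filter(x, v_s, R_s, W_b, H_b, R_b):
    """Estimate the speech image from a multichannel mixture STFT.

    x   : (F, N, I) complex mixture STFT (F bins, N frames, I channels)
    v_s : (F, N) non-negative speech variances (placeholder for a decoder output)
    R_s : (F, I, I) speech spatial covariance matrices
    W_b : (F, K), H_b : (K, N) non-negative NMF factors of the noise variance
    R_b : (F, I, I) noise spatial covariance matrices
    """
    v_b = W_b @ H_b  # (F, N) noise variance as a rank-K NMF model

    # Covariance of each source image in every time-frequency bin: (F, N, I, I)
    sigma_s = v_s[:, :, None, None] * R_s[:, None, :, :]
    sigma_b = v_b[:, :, None, None] * R_b[:, None, :, :]
    sigma_x = sigma_s + sigma_b  # sources assumed independent

    # Wiener gain G = Sigma_s Sigma_x^{-1}, computed for all bins at once
    gain = sigma_s @ np.linalg.inv(sigma_x)

    # Apply the gain to each mixture vector
    return np.einsum("fnij,fnj->fni", gain, x)


def random_spd(rng, n_freq, n_chan):
    """Random Hermitian positive-definite matrices, one per frequency bin."""
    a = rng.standard_normal((n_freq, n_chan, n_chan)) \
        + 1j * rng.standard_normal((n_freq, n_chan, n_chan))
    return a @ a.conj().transpose(0, 2, 1) + np.eye(n_chan)


# Toy data only; nothing here comes from the paper's experiments.
rng = np.random.default_rng(0)
F, N, I, K = 4, 5, 2, 3
x = rng.standard_normal((F, N, I)) + 1j * rng.standard_normal((F, N, I))
v_s = rng.uniform(0.5, 2.0, size=(F, N))
W_b = rng.uniform(0.1, 1.0, size=(F, K))
H_b = rng.uniform(0.1, 1.0, size=(K, N))
R_s = random_spd(rng, F, I)
R_b = random_spd(rng, F, I)

s_hat = multichannel_wiener_filter(x, v_s, R_s, W_b, H_b, R_b)
```

In this sketch the noise factors `W_b` and `H_b` are the only parts that would be fitted on the noisy recording itself, which mirrors the split the abstract describes between a pre-trained speech model and an unsupervised noise model.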
