Abstract

Localization in reverberant environments remains an open challenge. Recently, supervised learning approaches have demonstrated very promising results in addressing reverberation. However, even with large data volumes, the number of labels available for supervised learning in such environments is usually small. We propose to address this issue with a semi-supervised learning (SSL) approach, based on deep generative modeling. Our chosen deep generative model, the variational autoencoder (VAE), is trained to generate the phase of relative transfer functions (RTFs) between microphones. In parallel, a direction of arrival (DOA) classifier network based on RTF-phase is also trained. The joint generative and discriminative model, deemed VAE-SSL, is trained using labeled and unlabeled RTF-phase sequences. In learning to generate and classify the sequences, the VAE-SSL extracts the physical causes of the RTF-phase (i.e., source location) from distracting signal characteristics such as noise and speech activity. This facilitates effective end-to-end operation of the VAE-SSL, which requires minimal preprocessing of RTF-phase. VAE-SSL is compared with two signal processing-based approaches, steered response power with phase transform (SRP-PHAT) and MUltiple SIgnal Classification (MUSIC), as well as fully supervised CNNs. The approaches are compared using data from two real acoustic environments, one of which was recently obtained at the Technical University of Denmark specifically for our study. We find that VAE-SSL can outperform the conventional approaches and the CNN in label-limited scenarios. Further, the trained VAE-SSL system can generate new RTF-phase samples that capture the physics of the acoustic environment. Thus, the generative modeling in VAE-SSL provides a means of interpreting the learned representations. To the best of our knowledge, this paper presents the first approach to modeling the physics of acoustic propagation using deep generative modeling.
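
As a concrete illustration of the RTF-phase features referenced above, the sketch below estimates the phase of a relative transfer function between a microphone and a reference microphone from time-domain recordings. The cross-power spectral density (CPSD) ratio estimator, the sampling rate, and the segment length are assumptions for illustration, not necessarily the preprocessing used in the paper.

```python
# Minimal sketch: estimating RTF-phase between microphone m and a reference
# microphone. The CPSD-ratio estimator and parameters below are illustrative
# assumptions, not the authors' exact pipeline.
import numpy as np
from scipy.signal import csd

def rtf_phase(x_m, x_ref, fs=16000, nperseg=512):
    """Phase of the RTF H_m(f) relating the reference mic to mic m (X_m = H_m * X_ref)."""
    _, s_ref_m = csd(x_ref, x_m, fs=fs, nperseg=nperseg)      # E[X_ref^* X_m]
    _, s_ref_ref = csd(x_ref, x_ref, fs=fs, nperseg=nperseg)  # E[|X_ref|^2]
    h = s_ref_m / (s_ref_ref + 1e-12)                         # Welch-averaged RTF estimate
    return np.angle(h)                                        # wrapped phase in (-pi, pi]

# Example with synthetic signals: the second mic is a delayed copy of the first,
# so the RTF-phase is approximately linear in frequency.
rng = np.random.default_rng(0)
x_ref = rng.standard_normal(16000)
x_m = np.roll(x_ref, 3)            # 3-sample delay
phi = rtf_phase(x_m, x_ref)
```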

Highlights

  • Source localization is an important problem in acoustics and many related fields

  • For the VAE-based semi-supervised learning (VAE-SSL) approach, we find that the system generalizes well to the DTU environment using only a few labels, since it extracts physically meaningful features

  • We have proposed a semi-supervised approach to acoustic source localization in reverberant environments based on deep generative modeling with VAEs, which we deem VAE-SSL (a minimal illustrative sketch follows these highlights)
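
The following is a minimal sketch of the joint generative/discriminative idea behind VAE-SSL: a DOA classifier q(y|x), an encoder q(z|x,y), and a decoder p(x|z,y), trained on labeled and unlabeled RTF-phase features with a semi-supervised VAE objective. The feature dimension, number of DOA classes, latent size, layer widths, Gaussian reconstruction term, and hyperparameters are illustrative assumptions, not the authors' architecture.

```python
# Minimal sketch of semi-supervised VAE training with a DOA classifier, in the
# spirit of VAE-SSL as summarized above. Dimensions, layer widths, and the
# unit-variance Gaussian likelihood on RTF-phase are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

D, K, Z = 257, 19, 16  # assumed: RTF-phase bins, DOA classes, latent dimension

class VAESSL(nn.Module):
    def __init__(self):
        super().__init__()
        self.cls = nn.Sequential(nn.Linear(D, 128), nn.ReLU(), nn.Linear(128, K))        # q(y|x)
        self.enc = nn.Sequential(nn.Linear(D + K, 128), nn.ReLU(), nn.Linear(128, 2*Z))  # q(z|x,y)
        self.dec = nn.Sequential(nn.Linear(Z + K, 128), nn.ReLU(), nn.Linear(128, D))    # p(x|z,y)

    def elbo(self, x, y1h):
        mu, logvar = self.enc(torch.cat([x, y1h], -1)).chunk(2, -1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()        # reparameterization trick
        x_hat = self.dec(torch.cat([z, y1h], -1))
        rec = -0.5 * ((x - x_hat) ** 2).sum(-1)                     # log p(x|z,y), Gaussian up to a constant
        kl = -0.5 * (1 + logvar - mu ** 2 - logvar.exp()).sum(-1)   # KL(q(z|x,y) || N(0, I))
        return rec - kl

    def loss(self, x_lab, y_lab, x_unl, alpha=1.0):
        # Labeled term: negative ELBO plus a supervised cross-entropy on the classifier.
        y1h = F.one_hot(y_lab, K).float()
        l_lab = -self.elbo(x_lab, y1h) + alpha * F.cross_entropy(self.cls(x_lab), y_lab, reduction="none")
        # Unlabeled term: marginalize the ELBO over y under q(y|x) and add its entropy.
        q_y = F.softmax(self.cls(x_unl), -1)
        ys = torch.eye(K, device=x_unl.device)
        elbos = torch.stack([self.elbo(x_unl, ys[k].expand(x_unl.size(0), K)) for k in range(K)], -1)
        l_unl = -(q_y * elbos).sum(-1) + (q_y * q_y.clamp_min(1e-8).log()).sum(-1)
        return l_lab.mean() + l_unl.mean()

# Usage sketch: one optimization step on labeled and unlabeled RTF-phase batches (random placeholders).
model = VAESSL()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x_lab, y_lab, x_unl = torch.randn(8, D), torch.randint(0, K, (8,)), torch.randn(32, D)
opt.zero_grad()
model.loss(x_lab, y_lab, x_unl).backward()
opt.step()
```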

Introduction

Source localization is an important problem in acoustics and many related fields. The performance of localization algorithms is degraded by reverberation, which induces complex temporal arrival structure at sensor arrays. Despite extensive research [1]–[3], acoustic localization in reverberant environments remains a major challenge [4]. There has been great interest in machine learning (ML)-based techniques in acoustics, including source localization and event detection [5]–[14]. One difficulty for ML-based methods in acoustics is that, despite large volumes of recordings, the amount of labeled data is limited and acoustic propagation in natural environments is complex [1], [2]. This limitation has motivated recent approaches for localization based on semi-supervised learning (SSL) [15], [16].
