Abstract

Simultaneous localization, separation, and reconstruction of multiple sound sources are often required in everyday environments such as conference rooms, living rooms, and supermarkets. Deep neural networks (DNNs) have achieved considerable success in time-domain signal separation and reconstruction, improving the intelligibility of speech signals. In this paper, we propose a hybrid microphone-array signal processing approach for the near-field scenario that combines beamforming with a DNN. Using this method, both the location and the content of each sound source can be identified. Moreover, a sequenced virtual sound field reconstruction process makes the proposed approach well suited to sound fields containing a dominant, stronger source and masked, weaker sources: by iterating this process, all traceable major sound sources in a given sound field can be discovered. Replacing the conventional delay-and-sum (DAS) beamforming algorithm with the broadband weighted multiple signal classification (BW-MUSIC) method further improves localization accuracy and running time. The effectiveness of the proposed method for localizing and reconstructing speech signals was validated by simulations and experiments, with promising results: the localization was accurate, and the similarity and correlation between the reconstructed and original signals were high.
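
To make the beamforming stage of the pipeline concrete, the following is a minimal sketch of near-field delay-and-sum localization over a grid of candidate source positions. The function name `das_localize`, the grid-scan formulation, and all parameters are illustrative assumptions rather than the paper's implementation, which further replaces DAS with BW-MUSIC for speed and accuracy.

```python
import numpy as np

def das_localize(signals, mic_pos, grid, fs, c=343.0):
    """Locate the dominant source by scanning candidate near-field
    points with delay-and-sum (DAS) beamforming.

    signals : (M, N) array of microphone recordings
    mic_pos : (M, 3) microphone coordinates in metres
    grid    : (G, 3) candidate source positions in metres
    fs      : sampling rate in Hz
    c       : speed of sound in m/s
    """
    M, N = signals.shape
    spec = np.fft.rfft(signals, axis=1)          # per-channel spectra, (M, F)
    freqs = np.fft.rfftfreq(N, d=1.0 / fs)       # frequency bins, (F,)
    powers = np.empty(len(grid))
    for g, point in enumerate(grid):
        dists = np.linalg.norm(mic_pos - point, axis=1)  # (M,)
        delays = (dists - dists.min()) / c               # relative delays, s
        # Compensate each channel's propagation delay in the frequency
        # domain, then sum channels coherently and measure output power.
        steer = np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
        powers[g] = np.sum(np.abs(np.sum(spec * steer, axis=0)) ** 2)
    return grid[np.argmax(powers)]               # point with peak DAS power
```

In a sequenced reconstruction loop of the kind the abstract describes, a routine like this would first locate the dominant source; after that source is separated and subtracted from the field, the scan can be repeated to reveal the weaker, previously masked sources.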
