We propose a pseudo-determined blind source separation framework that exploits the information from a large number of microphones in an ad-hoc network to extract and enhance sound sources in a reverberant scenario. After compensating for the time offsets and sampling rate mismatch between (asynchronous) signals, we interpret the over-determined $M\times N$ mixture as a determined $M\times M$ mixture, where $M$ is the number of microphones, $N$ is the number of sources, and $M>N$. Next, we propose a pseudo-determined mixture model that allows an $M\times M$ independent component analysis (ICA) to be applied directly to the $M$-channel recordings. Moreover, we propose a reference-based permutation alignment scheme that aligns the permutation of the ICA outputs and classifies them into target channels, which contain the $N$ sources, and nontarget channels, which contain reverberation residuals. Finally, using the signals from the nontarget channels, we estimate in each target channel the power spectral density of the noise component, which we then suppress with a spectral postfilter. Interestingly, we also obtain late-reverberation suppression as a by-product. Experiments show that each processing block incrementally improves source separation and that the performance of the proposed pseudo-determined separation improves as the number of microphones increases.
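
The final postfiltering step can be illustrated with a minimal sketch. It is not the estimator used in this work: it assumes, purely for illustration, that the noise power spectral density in a target channel is approximated by the average power of the nontarget channels in the same time-frequency bin, and that suppression uses a Wiener-like gain with a lower bound. The names `postfilter_gain`, `apply_postfilter`, and `gain_floor` are hypothetical.

```python
def postfilter_gain(target_bin, nontarget_bins, gain_floor=0.1, eps=1e-12):
    """Wiener-like gain for a single time-frequency bin (illustrative only)."""
    # Assumption: noise PSD is the mean power of the nontarget channels in this bin.
    if nontarget_bins:
        noise_psd = sum(abs(z) ** 2 for z in nontarget_bins) / len(nontarget_bins)
    else:
        noise_psd = 0.0
    target_psd = abs(target_bin) ** 2
    # Subtract the estimated noise share; eps avoids division by zero.
    gain = 1.0 - noise_psd / (target_psd + eps)
    # The floor limits over-suppression when the noise estimate is too high.
    return max(gain, gain_floor)


def apply_postfilter(target, nontarget, gain_floor=0.1):
    """Apply the gain to every bin of one target channel.

    target:    list of complex STFT bins of one target channel
    nontarget: list of nontarget channels, each a list of complex bins
               with the same length as `target`
    """
    filtered = []
    for i, y in enumerate(target):
        bins_here = [channel[i] for channel in nontarget]
        filtered.append(postfilter_gain(y, bins_here, gain_floor) * y)
    return filtered


# Toy usage: two bins of one target channel and two nontarget channels.
target = [2 + 0j, 1 + 0j]
nontarget = [[0j, 1 + 0j], [0j, 1 + 0j]]
filtered = apply_postfilter(target, nontarget)
```

In this toy input the first bin has no nontarget energy and passes through unchanged, while the second bin, where the nontarget power matches the target power, is attenuated down to the gain floor.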