Abstract
The problem of source separation and noise reduction using multiple microphones is addressed. The minimum mean square error (MMSE) estimator for the multispeaker case is derived and a novel decomposition of this estimator is presented. The MMSE estimator is decomposed into two stages: first, a multispeaker linearly constrained minimum variance (LCMV) beamformer (BF); and second, a subsequent multispeaker Wiener postfilter. The first stage separates and enhances the signals of the individual speakers by utilizing the spatial characteristics of the speakers [as manifested by the respective acoustic transfer functions (ATFs)] and the noise power spectral density (PSD) matrix, while the second stage exploits the speakers’ PSD matrix to reduce the residual noise at the output of the first stage. The output vector of the multispeaker LCMV BF is proven to be a sufficient statistic for estimating the marginal speech signals in both the classical sense and the Bayesian sense. The log spectral amplitude estimator for the multispeaker case is also derived given the multispeaker LCMV BF outputs. The performance evaluation was conducted using measured ATFs and directional noise at various signal-to-noise ratio levels. It is empirically verified that the multispeaker postfilters are beneficial in terms of signal-to-interference-plus-noise ratio improvement when compared with the single-speaker postfilter.
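The two-stage decomposition described above can be illustrated with a short derivation. This is only a sketch in a standard narrowband (per frequency bin) model, and the notation is assumed here rather than taken from the paper: y is the microphone vector, H the matrix whose columns are the speakers' ATFs (assumed to have full column rank), s the vector of speech signals with PSD matrix \Phi_s, and v the noise with PSD matrix \Phi_v. Speech and noise are assumed zero-mean and mutually uncorrelated, and both PSD matrices are assumed invertible.

```latex
% Assumed signal model (all symbols are illustrative assumptions)
\mathbf{y} = \mathbf{H}\mathbf{s} + \mathbf{v}, \qquad
\boldsymbol{\Phi}_s = E\{\mathbf{s}\mathbf{s}^H\}, \qquad
\boldsymbol{\Phi}_v = E\{\mathbf{v}\mathbf{v}^H\}

% Stage 1: multispeaker LCMV BF with the distortionless constraint W^H H = I
\mathbf{W}_{\mathrm{LCMV}}^H
  = \left(\mathbf{H}^H \boldsymbol{\Phi}_v^{-1} \mathbf{H}\right)^{-1}
    \mathbf{H}^H \boldsymbol{\Phi}_v^{-1},
\qquad
\mathbf{z} = \mathbf{W}_{\mathrm{LCMV}}^H \mathbf{y} = \mathbf{s} + \mathbf{r}

% PSD matrix of the residual noise r at the LCMV output
\boldsymbol{\Phi}_r
  = \mathbf{W}_{\mathrm{LCMV}}^H \boldsymbol{\Phi}_v \mathbf{W}_{\mathrm{LCMV}}
  = \left(\mathbf{H}^H \boldsymbol{\Phi}_v^{-1} \mathbf{H}\right)^{-1}

% Linear MMSE estimator, rewritten with the matrix inversion lemma
\hat{\mathbf{s}}_{\mathrm{MMSE}}
  = \boldsymbol{\Phi}_s \mathbf{H}^H
    \left(\mathbf{H}\boldsymbol{\Phi}_s\mathbf{H}^H + \boldsymbol{\Phi}_v\right)^{-1}
    \mathbf{y}
  = \left(\boldsymbol{\Phi}_s^{-1} + \boldsymbol{\Phi}_r^{-1}\right)^{-1}
    \mathbf{H}^H \boldsymbol{\Phi}_v^{-1} \mathbf{y}

% Using (Phi_s^{-1} + Phi_r^{-1})^{-1} = Phi_s (Phi_s + Phi_r)^{-1} Phi_r
\hat{\mathbf{s}}_{\mathrm{MMSE}}
  = \underbrace{\boldsymbol{\Phi}_s
      \left(\boldsymbol{\Phi}_s + \boldsymbol{\Phi}_r\right)^{-1}}_{\text{Stage 2: multispeaker Wiener postfilter}}
    \;
    \underbrace{\boldsymbol{\Phi}_r \mathbf{H}^H \boldsymbol{\Phi}_v^{-1}
      \mathbf{y}}_{\text{Stage 1: LCMV output } \mathbf{z}}
```

Under these assumptions the first factor depends only on the speakers' PSD matrix and the residual-noise PSD matrix, while the second depends only on the ATFs and the noise PSD matrix, which matches the division of roles between the two stages stated in the abstract.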