Abstract

The problem of blind separation of speech signals in the presence of noise using multiple microphones is addressed. Blind estimation of the acoustic parameters and the individual source signals is carried out by applying the expectation-maximization (EM) algorithm. Two models for the speech signals are used, namely an unknown deterministic signal model and a complex Gaussian signal model. For each of the two alternatives, we define a statistical model and develop an EM-based algorithm to jointly estimate the acoustic parameters and the speech signals. The resulting algorithms are then compared from both theoretical and performance perspectives. In both cases, the latent data (defined differently for each alternative) are estimated in the E-step, whereas in the M-step the two algorithms estimate the acoustic transfer functions of each source and the noise covariance matrix. The algorithms differ in the way the clean speech signals are used in the EM scheme. When the clean signals are assumed to be deterministic and unknown, only the a posteriori probabilities of the presence of each source are estimated in the E-step, whereas their time–frequency coefficients are treated as parameters and estimated in the M-step using the minimum variance distortionless response beamformer. If the clean speech signals are modeled as complex Gaussian signals, their power spectral densities are estimated in the E-step using the multichannel Wiener filter output. The proposed algorithms were tested using reverberant noisy mixtures of two speech sources in different reverberation and noise conditions.
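
The two spatial filters named above can be illustrated with a minimal NumPy sketch. It is not the EM algorithm of the paper: it assumes a single source in a single time–frequency bin, with the acoustic transfer function `h`, the noise covariance `noise_cov`, and the source power spectral density `source_psd` already known, whereas the paper estimates these quantities blindly. All function and variable names are hypothetical. The sketch only shows the standard closed forms of the minimum variance distortionless response (MVDR) beamformer and the rank-1 multichannel Wiener filter (MWF), and the well-known fact that the MWF equals the MVDR beamformer followed by a single-channel Wiener gain.

```python
import numpy as np


def mvdr_weights(h, noise_cov):
    """MVDR weights w = R^{-1} h / (h^H R^{-1} h) for one frequency bin."""
    r_inv_h = np.linalg.solve(noise_cov, h)
    return r_inv_h / (h.conj() @ r_inv_h)


def mwf_weights(h, noise_cov, source_psd):
    """Rank-1 MWF weights w = phi * Ry^{-1} h, with Ry = phi * h h^H + R."""
    mix_cov = source_psd * np.outer(h, h.conj()) + noise_cov
    return source_psd * np.linalg.solve(mix_cov, h)


# Synthetic example values (assumptions, not data from the paper).
rng = np.random.default_rng(0)
n_mics = 4
h = rng.standard_normal(n_mics) + 1j * rng.standard_normal(n_mics)
a = rng.standard_normal((n_mics, n_mics)) + 1j * rng.standard_normal((n_mics, n_mics))
noise_cov = a @ a.conj().T + n_mics * np.eye(n_mics)  # Hermitian positive definite
source_psd = 2.0

w_mvdr = mvdr_weights(h, noise_cov)
w_mwf = mwf_weights(h, noise_cov, source_psd)

# Distortionless constraint: the MVDR response toward the source, w^H h, is 1.
distortionless_gain = w_mvdr.conj() @ h

# Residual noise power at the MVDR output is 1 / (h^H R^{-1} h).
residual_noise = 1.0 / np.real(h.conj() @ np.linalg.solve(noise_cov, h))

# Single-channel Wiener gain applied after the MVDR beamformer.
wiener_gain = source_psd / (source_psd + residual_noise)

# Beamformer outputs for one noisy observation y = h * s + noise-like term.
s = 1.0 - 0.5j
y = h * s + 0.1 * (rng.standard_normal(n_mics) + 1j * rng.standard_normal(n_mics))
s_mvdr = w_mvdr.conj() @ y
s_mwf = w_mwf.conj() @ y
```

Under these assumptions, `w_mwf` coincides with `wiener_gain * w_mvdr`, so the MWF output is the MVDR output scaled by a real gain between 0 and 1. This is one way to read the difference between the two signal models: the deterministic model leaves the source estimate undistorted, while the Gaussian model trades some distortion for additional noise reduction.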
