A multichannel subspace approach with signal presence probability for speech enhancement

Jungpyo Hong

doi:10.1007/s11045-019-00640-z

Abstract

For the last few decades, speech enhancement based on microphone arrays has relied primarily on prior information about the system model, e.g., array geometry and source location. However, estimating the time delay needed to align the microphone inputs is strongly affected by reverberation and microphone mismatch, so time-alignment preprocessing, e.g., fixed beamforming (the first branch of the generalized sidelobe canceller), is undesirable in general applications. Recently, interest has shifted to linear filtering, which requires only the second-order statistics of the noisy input and the estimated noise. This paper proposes a linear filter design based on a multichannel subspace approach for speech enhancement. The contribution of the proposed multichannel subspace methods is threefold. First, a linear filter is applied in the multichannel frequency domain using a spatiospectral correlation matrix. Second, three types of multichannel signal presence probability (MC-SPP) are derived in the subspace domain. Third, the MC-SPPs are incorporated into the gain modification of the linear filter to further improve noise reduction. Among the gain modifications, the one using the subspace probability associated with the eigenvector of the maximum eigenvalue achieved the best noise reduction performance. In the evaluation, compared with the minimum variance distortionless response (MVDR) beamformer with state-of-the-art relative transfer function estimation, the proposed subspace-based methods improved overall SNR by approximately 4 dB on average while maintaining a similar cepstral distance in adverse noisy environments.

Full Text