Most frequency-domain blind source separation (BSS) methods rely on the multiplicative narrowband assumption, which does not hold in environments with long reverberation times. Convolutive transfer function (CTF)-based BSS methods do not rely on this assumption, and they separate sources significantly better than traditional algorithms in such environments. However, CTF-based BSS methods and their variants, e.g., autoregressive (AR) BSS methods, introduce some modeling error because of the truncation or approximation applied during optimization. To address this problem, we propose a frequency-domain BSS method that employs a hybrid AR and CTF model, which can represent the early reflections and late reverberation more precisely. Furthermore, we adopt a Gaussian noise model to handle the BSS problem in noisy reverberant environments. We formulate the objective function using the maximum log-likelihood criterion and derive an efficient iterative algorithm for parameter estimation based on the block coordinate descent (BCD) method. Experimental results show that the proposed method achieves better separation performance than existing methods in environments with long reverberation times.
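
For readers unfamiliar with the two mixing models contrasted above, the following sketch writes them in a common textbook form. The notation is an assumption made here for illustration and is not taken from the paper: $\mathbf{x}(f,t)$ denotes the multichannel mixture at frequency bin $f$ and frame $t$, $s_n(f,t)$ the $n$-th of $N$ sources, $\mathbf{a}_n$ the corresponding transfer-function vector, and $L$ the CTF length in frames. The hybrid AR and CTF model proposed in the paper is not reproduced here.

```latex
% Multiplicative narrowband model: each frequency bin is an
% instantaneous mixture, valid only when the room impulse response
% is short relative to the STFT window.
\mathbf{x}(f,t) \approx \sum_{n=1}^{N} \mathbf{a}_n(f)\, s_n(f,t)

% Convolutive transfer function (CTF) model: each frequency bin is a
% convolution across frames, so reverberation longer than the STFT
% window is captured by the taps l = 0, ..., L-1.
\mathbf{x}(f,t) \approx \sum_{n=1}^{N} \sum_{l=0}^{L-1} \mathbf{a}_n(f,l)\, s_n(f,t-l)
```

Setting $L = 1$ in the CTF model recovers the narrowband model, which is one way to see why the narrowband form can break down when reverberation extends over several frames.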