Abstract

Frequency-domain blind source separation (BSS) separates convolutively mixed speech efficiently by reducing time-domain convolutive mixtures to instantaneous mixtures of complex-valued signals at each frequency bin, but it suffers from permutation ambiguity. Because the semi-blind complex kurtosis maximization (KM) algorithm can separate complex-valued signals in a fixed order by incorporating magnitude priors about the sources as references, we apply it here to speech separation in the frequency domain. The closeness measure between the BSS estimate and the reference is vital for the semi-blind KM algorithm to extract a specific source once the reference is determined, so we examine two closeness measures in this study. One is based on the magnitude of the reference, as originally used by the semi-blind KM algorithm; the other is based on the energy of the reference. We define a distance between the source of interest and the other sources in terms of the closeness measure, and compare both the distances for frequency-domain speech signals and the speech separation performance obtained with the two closeness measures. The results demonstrate that the distance under the new, energy-based closeness measure is larger than that under the original one, owing to energy matching between the estimate and the reference, and that the semi-blind KM algorithm using the new closeness measure achieves better performance for frequency-domain speech separation.
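
The reduction from convolutive to instantaneous mixtures mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, array shapes, and the use of a single frame with circular convolution are assumptions made here so that the identity holds exactly. With a windowed short-time Fourier transform, as is usual in practice, the per-bin instantaneous model is only an approximation.

```python
import numpy as np

def mix_time_domain(sources, filters):
    """Circularly convolve each source with its mixing filter and sum per sensor.

    sources: real array, shape (n_src, n_samples)
    filters: real array, shape (n_mic, n_src, n_taps)  (hypothetical layout)
    """
    n_mic, n_src, n_taps = filters.shape
    n_samples = sources.shape[1]
    mixed = np.zeros((n_mic, n_samples))
    for i in range(n_mic):
        for j in range(n_src):
            for k in range(n_taps):
                # np.roll(s, k)[n] == s[(n - k) mod N], i.e. a circular delay by k
                mixed[i] += filters[i, j, k] * np.roll(sources[j], k)
    return mixed

def mix_per_bin(sources, filters):
    """Apply one complex mixing matrix per frequency bin: X[:, f] = H[:, :, f] @ S[:, f]."""
    n_samples = sources.shape[1]
    S = np.fft.fft(sources, axis=-1)               # (n_src, n_bins)
    H = np.fft.fft(filters, n=n_samples, axis=-1)  # (n_mic, n_src, n_bins), zero-padded
    return np.einsum("ijf,jf->if", H, S)

# Toy data: 2 sources, 2 sensors, 4-tap mixing filters (all values arbitrary).
rng = np.random.default_rng(0)
sources = rng.standard_normal((2, 64))
filters = rng.standard_normal((2, 2, 4))

X_from_time = np.fft.fft(mix_time_domain(sources, filters), axis=-1)
X_from_bins = mix_per_bin(sources, filters)
print(np.allclose(X_from_time, X_from_bins))
```

Because each bin is separated independently, the order of the recovered sources can differ from bin to bin; this is the permutation ambiguity that the reference-based, fixed-order extraction is intended to avoid.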
