Blind speech separation using a joint model of speech production

D Smith, J Lukasiak, I Burnett

doi:10.1109/lsp.2005.856869

Abstract

We propose a new blind signal separation (BSS) technique, developed specifically for speech, that exploits a priori knowledge of speech production mechanisms. Our approach jointly models the autoregressive (AR) structure and the fundamental frequency (F0) production mechanism of speech. We compare the separation performance of our joint AR-F0 algorithm with that of existing BSS algorithms that model either the AR structure or the F0 of speech individually. Experimental results indicate that the joint algorithm separates better than both the individual AR algorithm (up to 77% improvement) and the individual F0 algorithm (up to 50% improvement). This suggests that speech separation performance improves when the BSS model describes the speech production process more realistically.
