Abstract

Speech enhancement and speech separation are important front-ends for many speech processing systems. In real-world tasks, background noise is often mixed with interfering human voices. In this paper, we explore a framework that unifies speech enhancement and speech separation in a speaker-dependent scenario based on deep neural networks (DNNs). Using a supervised method, a DNN is adopted to directly model a nonlinear mapping function between noisy and clean speech signals. Signals from interfering speakers are treated as one type of universal noise in our framework. To handle a wide range of additive noise in real-world situations, a large training set encompassing many possible combinations of speech and noise types is designed. Experimental results demonstrate that the proposed framework achieves performance comparable to that of dedicated single-task speech enhancement or separation systems. Furthermore, the resulting DNN model, trained on artificially synthesized data, is also effective on noisy speech recorded in real-world conditions.
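The core idea above, training a network to regress clean speech features from noisy ones under an additive-noise assumption, can be sketched as follows. This is a minimal illustration, not the paper's actual architecture or features: the feature dimension, hidden size, learning rate, and synthetic data are all illustrative assumptions.

```python
import numpy as np

# Hedged sketch of supervised noisy-to-clean feature regression.
# A tiny one-hidden-layer network is trained with plain gradient descent
# to map "noisy" frames to "clean" frames; all sizes are assumptions.
rng = np.random.default_rng(0)

dim, hidden, n_frames = 8, 32, 256   # illustrative dimensions

clean = rng.standard_normal((n_frames, dim))
noise = 0.3 * rng.standard_normal((n_frames, dim))
noisy = clean + noise                # additive-noise assumption

W1 = 0.1 * rng.standard_normal((dim, hidden))
b1 = np.zeros(hidden)
W2 = 0.1 * rng.standard_normal((hidden, dim))
b2 = np.zeros(dim)

def forward(x):
    """Hidden tanh layer followed by a linear output layer."""
    h = np.tanh(x @ W1 + b1)
    return h, h @ W2 + b2

lr = 0.05
losses = []
for epoch in range(200):
    h, pred = forward(noisy)
    err = pred - clean                       # regression error per frame
    losses.append(np.mean(err ** 2))         # MSE training objective
    # Backpropagation through the two layers
    gW2 = h.T @ err / n_frames
    gb2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1 - h ** 2)         # tanh derivative
    gW1 = noisy.T @ dh / n_frames
    gb1 = dh.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

In the paper's setting, the frames would be (log-)spectral features of mixed speech, the interfering speaker's signal would simply be folded into the noise term, and the network would be much larger and trained on a broad mixture of noise types.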
