Exploring multi-channel features for denoising-autoencoder-based speech enhancement

Shoko Araki, Masakiyo Fujimoto, Marc Delcroix, Tomohiro Nakatani, Tomoki Hayashi, Kazuya Takeda

doi:10.1109/icassp.2015.7177943

Abstract

This paper investigates a multi-channel denoising autoencoder (DAE)-based speech enhancement approach. In recent years, deep neural network (DNN)-based monaural speech enhancement and robust automatic speech recognition (ASR) approaches have attracted much attention due to their high performance. Although multi-channel speech enhancement usually outperforms single-channel approaches, there has been little research on the use of multi-channel processing in the context of DAEs. In this paper, we explore the use of several multi-channel features as DAE input to determine whether multi-channel information can improve performance. Experimental results show that certain multi-channel features outperform both a monaural DAE and a conventional time-frequency-mask-based speech enhancement method.

Full Text