Voice activity detection with array signal processing in the wavelet domain

Yusuke Hioka, Nobuyuki Hamada

doi:10.5281/zenodo.38027

Abstract

Many conventional voice activity detection (VAD) methods assume that the speech signal is acquired in high quality. However, speech-based human-machine interfaces are usually used in indoor environments where various interferences exist, so VAD performance deteriorates seriously. In this paper, we propose a novel VAD method that applies array signal processing in the wavelet domain, using the time, frequency, and spatial information in the speech signal to separate interferences. In the proposed method, the speech signal acquired by a microphone array is first decomposed into appropriate subbands with a wavelet packet, and array signal processing is then executed on each subband to realize a VAD system for speech arriving from a particular direction.
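
The two-stage pipeline described in the abstract (wavelet packet decomposition, then array processing per subband) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the wavelet, the array processing, or the decision rule, so everything below is an assumption. The sketch uses a Haar filter pair for the wavelet packet, a simple steered coherence ratio (beam power over mean channel power) as the per-subband array measure, and a majority vote over subbands as the decision. The function names `haar_packet`, `subband_coherence`, and `directional_vad`, the integer sample delays, and the threshold value are all hypothetical.

```python
import numpy as np

def haar_packet(x, levels):
    """Full Haar wavelet packet tree; returns 2**levels subband signals.

    Assumption: Haar filters are used only because they are the simplest
    orthonormal pair; the source does not name a wavelet.
    """
    bands = [np.asarray(x, dtype=float)]
    for _ in range(levels):
        nxt = []
        for b in bands:
            b = b[: len(b) // 2 * 2]               # drop a trailing odd sample
            even, odd = b[0::2], b[1::2]
            nxt.append((even + odd) / np.sqrt(2))  # low-pass branch
            nxt.append((even - odd) / np.sqrt(2))  # high-pass branch
        bands = nxt
    return bands

def subband_coherence(frames, delays, levels=3):
    """Per-subband ratio of steered beam power to mean channel power.

    frames: (n_mics, n_samples) array for one analysis frame.
    delays: assumed integer sample delay of the target at each microphone.
    The ratio is near 1 for a source from the look direction and near
    1 / n_mics for noise that is uncorrelated across microphones.
    """
    frames = np.asarray(frames, dtype=float)
    # Steer toward the look direction by undoing each channel's delay
    # (circular shift, a simplification that ignores frame edges).
    aligned = [np.roll(ch, -d) for ch, d in zip(frames, delays)]
    packets = [haar_packet(ch, levels) for ch in aligned]
    ratios = []
    for k in range(2 ** levels):
        sub = np.stack([p[k] for p in packets])        # (n_mics, n_coeffs)
        beam_power = np.sum(np.mean(sub, axis=0) ** 2)
        chan_power = np.mean(np.sum(sub ** 2, axis=1))
        ratios.append(beam_power / chan_power if chan_power > 0 else 0.0)
    return np.array(ratios)

def directional_vad(frames, delays, levels=3, threshold=0.6):
    """Declare activity if most subbands look coherent from the look direction."""
    ratios = subband_coherence(frames, delays, levels)
    return bool(np.mean(ratios > threshold) >= 0.5)
```

Under these assumptions, a frame in which every microphone carries the same signal shifted by the given delays yields ratios close to 1 in each subband, while independent noise on four microphones yields ratios around 0.25. A practical system would also need fractional delays, a speech-specific energy or spectral criterion, and smoothing across frames, none of which is covered here.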
