Voice activity detection system and method

Erol Eryilmaz

doi:10.1121/1.1572366

Abstract

An improved voice activity detection system and method are provided for use in speakerphones and other voice-activated systems. To facilitate switching between operating modes, the voice activity detection scheme uses a new voice energy term based on an integral of the absolute value of a derivative of the speech signal. In silence mode, voice activity is detected by comparing a first ratio, that of a current voice energy value to a background noise value, with a voice activity threshold value. Voice activity is detected when the first ratio is greater than the voice activity threshold value. In transmit and receive modes, the direction of the voice activity is identified by comparing a second ratio, that of a transmit path voice energy value to a receive path voice energy value, with a transmit threshold value and a receive threshold value. When the second ratio is greater than the transmit threshold value, voice activity is present in the transmit path. Similarly, when the second ratio is less than the receive threshold value, voice activity is present in the receive path. Once voice activity is detected in one of the paths, the speakerphone or voice-activated system begins transitioning to the applicable mode by gradually suppressing the signal in the other path according to the value of the second ratio.
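The detection logic in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the discrete approximation of the integral (a sum of absolute first differences over a frame of samples), the guards against division by zero, and the "undecided" result for ratios between the two thresholds are all assumptions made for illustration. The abstract does not specify how the suppression gain depends on the second ratio, so that step is left out.

```python
from typing import Sequence


def voice_energy(samples: Sequence[float]) -> float:
    """Approximate the integral of |d/dt s(t)| over one frame.

    Assumption: the derivative is taken as the first difference between
    consecutive samples, and the integral as a plain sum.
    """
    return sum(abs(b - a) for a, b in zip(samples, samples[1:]))


def voice_detected(current_energy: float,
                   background_noise: float,
                   vad_threshold: float) -> bool:
    """Silence mode: compare the first ratio with the voice activity threshold."""
    if background_noise <= 0.0:
        # Assumed guard: with no noise estimate, any energy counts as activity.
        return current_energy > 0.0
    return current_energy / background_noise > vad_threshold


def voice_direction(tx_energy: float,
                    rx_energy: float,
                    tx_threshold: float,
                    rx_threshold: float) -> str:
    """Compare the second ratio (transmit / receive) with both thresholds."""
    ratio = tx_energy / max(rx_energy, 1e-12)  # assumed guard against zero
    if ratio > tx_threshold:
        return "transmit"
    if ratio < rx_threshold:
        return "receive"
    return "undecided"  # assumed label; the abstract does not name this case


frame = [0.0, 1.0, -1.0, 0.0]
energy = voice_energy(frame)
print(energy, voice_detected(energy, 1.0, 2.0))
print(voice_direction(8.0, 1.0, 2.0, 0.5))
```

The threshold values used in the example calls are arbitrary. In practice they would be tuned to the device and its acoustic environment.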
