An adaptive band-partitioning spectral entropy based speech detection in realistic noisy environments

Kun-Ching Wang

doi:10.21437/interspeech.2004-346

Abstract

The feature parameters used for speech detection are generally highly sensitive to the environment. Speech detection performance degrades severely in realistic noisy environments because those feature parameters cannot fully express the characteristics of a speech signal. This study therefore uses the acoustic fingerprints of the speech spectrogram as a robust feature for distinguishing speech from non-speech, especially in adverse environments. Building on this feature and on the fact that the frequency energies of different types of noise are concentrated in different frequency bands [12], an ABSE (Adaptive Band-partitioning Spectral Entropy)-based speech detection algorithm is proposed to detect speech signals in adverse environments. The ABSE-based algorithm is also shown to work in real time with minimal processing delay. Experimental results indicate that the ABSE parameter is very effective across several SNRs (signal-to-noise ratios) and various noise conditions. Furthermore, the proposed ABSE-based algorithm outperforms other approaches and is reliable in a real car.
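The general idea of a band-wise spectral entropy feature can be illustrated with a minimal sketch. This is not the ABSE algorithm itself: the abstract does not specify how bands are selected adaptively, so the sketch uses a fixed, uniform band partition. The function name `band_spectral_entropy`, the Hamming window, the 32-band split, the 256-sample frame, and the 8 kHz sampling rate are all assumptions made for illustration.

```python
import numpy as np


def band_spectral_entropy(frame, n_bands=32, eps=1e-12):
    """Entropy of the normalized band energies of one signal frame.

    Illustrative only: a fixed uniform band split is assumed here,
    whereas the paper's method partitions bands adaptively.
    """
    # Window the frame and compute its one-sided power spectrum.
    windowed = frame * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2

    # Group FFT bins into contiguous bands and sum the energy in each.
    bands = np.array_split(power, n_bands)
    energy = np.array([band.sum() for band in bands])

    # Treat the normalized band energies as a probability distribution.
    p = energy / (energy.sum() + eps)

    # Shannon entropy in nats; eps guards against log(0).
    return float(-np.sum(p * np.log(p + eps)))


# Hypothetical test signals: a 1 kHz tone and white noise at 8 kHz.
fs = 8000
n = 256
t = np.arange(n) / fs
tone = np.sin(2 * np.pi * 1000 * t)
noise = np.random.default_rng(0).standard_normal(n)

tone_entropy = band_spectral_entropy(tone)
noise_entropy = band_spectral_entropy(noise)
print(f"tone: {tone_entropy:.3f}  noise: {noise_entropy:.3f}")
```

A signal whose energy is concentrated in a few bands, such as the tone, yields a low entropy, while a spectrally flat signal such as white noise approaches the maximum of log(n_bands). A detector built on this kind of feature would compare the per-frame entropy against a threshold; the choice of threshold and of which bands to keep is where an adaptive scheme would come in.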
