A Novel Algorithm to Speech Endpoint Detection in Noisy Environments Based on Energy-Entropy Method

Hanmid Dehghani

doi:10.1234/mjee.v2i4.139

Abstract

Endpoint detection, the task of distinguishing speech from non-speech segments, is one of the key preprocessing operations in automatic speech recognition (ASR) systems. The energy of the speech signal and the zero-crossing rate (ZCR) are usually used to locate the beginning and end of an utterance. Both features have been shown to be effective for endpoint detection, but they fail in high-noise environments. In this paper, we integrate the modified Teager approach with the Energy-Entropy Features. In the new algorithm, the Teager Energy is used to determine crude endpoints, and the Energy-Entropy Features are used to make the final decision. The advantage of this method is that there is no need to estimate the background noise. It is therefore well suited to environments where the noise at the beginning or end of the utterance is very strong, or where there is not enough “silence” before or after the utterance. Experimental results on Farsi speech show that the accuracy of the algorithm is satisfactory for speech endpoint detection.
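The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify the "modified" Teager approach, so the standard Teager-Kaiser operator is used instead, and the way energy and spectral entropy are combined here (the square root of one plus the absolute product of their deviations from the per-utterance median) is an assumption, as are all function names, frame sizes, and thresholds. The median baseline is a convenience of this sketch and presumes that speech occupies fewer than half of the frames.

```python
import numpy as np

def teager_energy(x):
    """Standard Teager-Kaiser operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    x = np.asarray(x, dtype=float)
    psi = np.zeros_like(x)
    psi[1:-1] = x[1:-1] ** 2 - x[:-2] * x[2:]
    return psi

def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping frames, one frame per row."""
    x = np.asarray(x, dtype=float)
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def spectral_entropy(frames):
    """Shannon entropy of each frame's normalized power spectrum."""
    window = np.hanning(frames.shape[1])
    power = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2
    p = power / (power.sum(axis=1, keepdims=True) + 1e-12)
    return -(p * np.log(p + 1e-12)).sum(axis=1)

def energy_entropy_feature(frames):
    """Assumed combination rule: sqrt(1 + |dE * dH|), deviations from the median."""
    energy = (frames ** 2).sum(axis=1)
    entropy = spectral_entropy(frames)
    d_energy = energy - np.median(energy)
    d_entropy = entropy - np.median(entropy)
    return np.sqrt(1.0 + np.abs(d_energy * d_entropy))

def detect_endpoints(x, frame_len=256, hop=128,
                     teo_ratio=0.1, eef_ratio=0.1, margin=5):
    """Return (start_sample, end_sample), or None if nothing is detected."""
    frames = frame_signal(x, frame_len, hop)

    # Stage 1: crude endpoints from the frame-averaged Teager energy.
    teo = frame_signal(teager_energy(x), frame_len, hop).mean(axis=1)
    crude = np.flatnonzero(teo > teo_ratio * teo.max())
    if crude.size == 0:
        return None

    # Stage 2: final decision from the energy-entropy feature,
    # searched in a small margin around the crude endpoints.
    lo = max(crude[0] - margin, 0)
    hi = min(crude[-1] + margin, len(frames) - 1)
    eef = energy_entropy_feature(frames)
    threshold = 1.0 + eef_ratio * (eef.max() - 1.0)
    keep = lo + np.flatnonzero(eef[lo:hi + 1] > threshold)
    if keep.size == 0:
        keep = crude  # fall back to the crude estimate
    return int(keep[0] * hop), int(keep[-1] * hop + frame_len)
```

Calling `detect_endpoints(signal)` on a mono waveform returns sample indices for the estimated start and end of the utterance. The ratios `teo_ratio` and `eef_ratio` are relative to the maximum observed in the utterance, which keeps the sketch free of an explicit noise-floor estimate, but they would need tuning on real recordings.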
