Abstract

Speech endpoint detection is one of the key problems in the practical application of speech recognition systems. In this paper, a speech signal containing chirp is decomposed into several intrinsic mode functions (IMFs) by ensemble empirical mode decomposition (EEMD). This also eliminates the mode-mixing phenomenon that usually appears when speech signals are processed with empirical mode decomposition (EMD). The IMFs that contain most of the noise are then selected by an adaptive algorithm. Finally, the selected IMFs and the speech signal containing chirp are fed into independent component analysis (ICA), which separates out the clean speech signal. This improves the accuracy of speech endpoint detection. The results show that the proposed method is effective and strongly robust to noise, and that it is especially suitable for speech endpoint detection at low SNR.
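The decomposition and separation steps named in the abstract can be written out in the standard textbook form. This is a sketch of the usual EEMD and ICA formulations, not necessarily the exact variant used in the paper; the symbols N (ensemble size), J (number of IMFs), w_i (added white noise), A (mixing matrix), and W (unmixing matrix) are notation assumed here for illustration.

```latex
\begin{align}
  % EEMD step 1: add N independent white-noise realizations to the signal x(t)
  x_i(t) &= x(t) + w_i(t), \qquad i = 1, \dots, N \\
  % EEMD step 2: decompose each noisy copy by EMD into J IMFs plus a residue
  x_i(t) &= \sum_{j=1}^{J} c_{ij}(t) + r_i(t) \\
  % EEMD step 3: the j-th IMF is the ensemble mean over the N trials
  c_j(t) &= \frac{1}{N} \sum_{i=1}^{N} c_{ij}(t) \\
  % ICA: observations are modeled as a linear mixture of independent sources,
  % and the sources are estimated with an unmixing matrix W
  \mathbf{x}(t) &= A\,\mathbf{s}(t), \qquad \mathbf{y}(t) = W\,\mathbf{x}(t)
\end{align}
```

Averaging over the ensemble cancels most of the added noise while keeping the IMFs aligned in scale across trials, which is what reduces mode mixing relative to plain EMD.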

Highlights

  • Speech endpoint detection is of great significance in speech signal processing

  • There are many speech endpoint detection methods, such as short-time energy, short-time zero-crossing rate, information entropy, Mel-frequency cepstral coefficients (MFCC), hidden Markov models (HMM), and the wavelet transform

  • Since these methods cannot detect speech accurately at low signal-to-noise ratio (SNR), this paper presents an endpoint detection method based on ensemble empirical mode decomposition (EEMD) [4]


Summary

Introduction

Speech endpoint detection is of great significance in speech signal processing. Accurate endpoint detection improves the accuracy of speech recognition and reduces the amount of data to be processed. There are many speech endpoint detection methods, such as short-time energy, short-time zero-crossing rate, information entropy, Mel-frequency cepstral coefficients (MFCC), hidden Markov models (HMM), and the wavelet transform. These methods still have shortcomings, especially under low signal-to-noise ratio (SNR) conditions, where they cannot detect speech accurately. This paper therefore presents an endpoint detection method based on ensemble empirical mode decomposition (EEMD) [4]. The resulting decomposition has real physical meaning and a higher time-frequency resolution, which makes the method a significant step forward for analyzing non-stationary and nonlinear speech signals.
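For reference, the two simplest baselines in that list, short-time energy and short-time zero-crossing rate, can be sketched as below. This is a minimal illustration on a synthetic signal, not the method proposed in the paper; the frame length, hop size, threshold ratio, test signal, and all function names are assumptions chosen for the example.

```python
import math
import random


def frame_signal(x, frame_len, hop):
    """Split x into overlapping frames; a trailing partial frame is dropped."""
    return [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(s * s for s in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def detect_endpoints(x, frame_len=200, hop=100, ratio=0.1):
    """Return (start_sample, end_sample) of the active region, or None.

    Assumed rule for this sketch: a frame is active when its energy
    exceeds `ratio` times the maximum frame energy.
    """
    energies = [short_time_energy(f) for f in frame_signal(x, frame_len, hop)]
    if not energies:
        return None
    threshold = ratio * max(energies)
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    return active[0] * hop, active[-1] * hop + frame_len


# Synthetic test signal (hypothetical): 1 s at 8 kHz, weak uniform noise
# everywhere, plus a 200 Hz tone standing in for speech in samples 2400..5599.
random.seed(0)
fs = 8000
signal = [random.uniform(-0.01, 0.01) for _ in range(fs)]
for n in range(2400, 5600):
    signal[n] += math.sin(2 * math.pi * 200 * n / fs)

endpoints = detect_endpoints(signal)
noise_zcr = zero_crossing_rate(signal[0:200])     # noise-only frame
tone_zcr = zero_crossing_rate(signal[3000:3200])  # tone-dominated frame
print(endpoints, noise_zcr, tone_zcr)
```

With a clean background the energy threshold brackets the tone to within one frame. As the noise level approaches the signal level, frame energies of noise and speech overlap and a fixed threshold stops separating them, which is the low-SNR weakness the paragraph above refers to.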

Mechatronics and Information Technology
Speech Endpoint Detection
