A Speech Endpoint Detection Method Based on Cascaded Speech Enhancement

Qiang Wu, Yang Liu

doi:10.1109/iceitsa54226.2021.00010

Abstract

To improve the robustness of speech endpoint detection at low SNR, a new endpoint detection method based on cascaded speech enhancement is proposed. First, a gain factor is introduced to optimize the speech enhancement network. Then, using a cascaded optimization strategy, the regression task of speech enhancement (L2 loss) and the classification task of speech endpoint detection (cross-entropy loss) are combined into a single joint classification task trained with cross-entropy, so that the network performs both speech enhancement and speech endpoint detection. Experimental results show that speech enhancement improves the accuracy of speech endpoint detection and that the proposed method retains more of the information useful for endpoint detection. At low SNR, the proposed method achieves better results.
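
The cascaded idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the network architecture or how the gain factor enters, so everything here is an assumption made for illustration: a hypothetical per-bin multiplicative gain stands in for the enhancement stage, a hypothetical logistic classifier stands in for the detection stage, and the names `enhance`, `detect`, `cascaded_loss`, and the parameter keys are invented. The only point taken from the abstract is that the two stages are chained and scored with a single cross-entropy loss.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def enhance(noisy_frame, gain_w, gain_b):
    """Stage 1 (assumed form): apply a per-bin gain in (0, 1) to the noisy features."""
    gains = [sigmoid(w * x + gain_b) for w, x in zip(gain_w, noisy_frame)]
    return [g * x for g, x in zip(gains, noisy_frame)]


def detect(enhanced_frame, vad_w, vad_b):
    """Stage 2 (assumed form): logistic speech / non-speech classifier."""
    z = sum(w * x for w, x in zip(vad_w, enhanced_frame)) + vad_b
    return sigmoid(z)


def cross_entropy(p, label, eps=1e-12):
    """Binary cross-entropy for one frame; eps guards against log(0)."""
    return -(label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps))


def cascaded_loss(frames, labels, params):
    """Mean cross-entropy of the detector applied to the enhanced frames.

    Because the loss is computed after both stages, minimizing it would
    adjust the enhancement parameters toward whatever helps detection.
    """
    total = 0.0
    for frame, label in zip(frames, labels):
        enhanced = enhance(frame, params["gain_w"], params["gain_b"])
        total += cross_entropy(detect(enhanced, params["vad_w"], params["vad_b"]), label)
    return total / len(frames)


# Toy data: two 3-bin feature frames, labelled speech (1) and non-speech (0).
frames = [[0.9, 1.2, 0.7], [0.1, 0.2, 0.1]]
labels = [1, 0]
params = {"gain_w": [0.0, 0.0, 0.0], "gain_b": 0.0,
          "vad_w": [0.0, 0.0, 0.0], "vad_b": 0.0}
loss = cascaded_loss(frames, labels, params)
```

With all parameters at zero, every gain is 0.5 and the classifier outputs a speech probability of 0.5, so the loss is ln 2 (about 0.693). A training loop would lower this value by updating both stages through the same loss.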

Full Text