Performance Analysis of Voice Activity Detector in Presence of Non-stationary Noise

Rahul Jaiswal

doi:10.1007/978-981-16-8129-5_10

Abstract

Background noise degrades speech, and many speech processing applications depend on accurately detecting the voiced segments in the degraded signal. This paper addresses the separation of speech and non-speech (noise or silence) segments in non-stationary noisy environments by means of a voice activity detector (VAD). A VAD separates speech from non-speech segments by extracting speech features and comparing them to a threshold. The VAD algorithms considered in this paper are based on two speech features: energy and spectral centroid. The NOIZEUS speech corpus, which contains speech degraded by non-stationary noises at four different SNRs, is used. The performance of the VAD algorithms is evaluated using the F-score and the Euclidean distance, in comparison with the ground-truth VAD. Results demonstrate that, for the noise conditions tested, a weighted spectral centroid VAD achieves outstanding performance.
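To make the feature-and-threshold idea concrete, the following is a minimal sketch of a frame-based VAD, not the paper's implementation. It computes the two features named in the abstract (short-time energy and spectral centroid), thresholds the energy, and scores the decision against a ground-truth labelling with the F-score. The function names, the synthetic test signal, the 8 kHz sampling rate, the 25 ms non-overlapping frames, and the "mean of the feature" threshold rule are all assumptions made for illustration; the Euclidean distance measure is omitted because the abstract does not define it.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D array into frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_time_energy(frames):
    """Mean squared amplitude of each frame."""
    return np.mean(frames ** 2, axis=1)

def spectral_centroid(frames, fs):
    """Magnitude-weighted mean frequency (Hz) of each frame."""
    window = np.hamming(frames.shape[1])
    mag = np.abs(np.fft.rfft(frames * window, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / fs)
    return (mag @ freqs) / (mag.sum(axis=1) + 1e-12)

def f_score(pred, truth):
    """F-score (harmonic mean of precision and recall) of boolean decisions."""
    tp = np.sum(pred & truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Synthetic stand-in for a noisy utterance (assumption, not NOIZEUS data):
# 0.5 s of low-level noise followed by 0.5 s of a 300 Hz tone plus noise.
fs = 8000
rng = np.random.default_rng(0)
t = np.arange(fs) / fs
signal = 0.01 * rng.standard_normal(fs)
signal[fs // 2:] += 0.5 * np.sin(2 * np.pi * 300 * t[fs // 2:])

# Sample-level ground truth, reduced to one label per frame by majority vote.
labels = np.zeros(fs, dtype=bool)
labels[fs // 2:] = True

frame_len = hop = 200  # 25 ms frames, no overlap (illustrative choice)
frames = frame_signal(signal, frame_len, hop)
truth = frame_signal(labels, frame_len, hop).mean(axis=1) > 0.5

energy = short_time_energy(frames)
centroid = spectral_centroid(frames, fs)  # second feature, computed per frame

# Placeholder threshold rule: the mean of the feature over the utterance.
pred = energy > energy.mean()
print("F-score of energy VAD:", f_score(pred, truth))
```

A spectral centroid VAD would threshold `centroid` in the same way. Which side of the threshold counts as speech, and how the threshold is chosen, depend on the data and on the method, so both should be taken from the paper's full text rather than from this sketch.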

Full Text