Abstract

The goal of whisper activity detection (WAD) is to find the whispered speech segments in a given noisy recording of whispered speech. Because whispering lacks periodic glottal excitation, it resembles unvoiced speech. This noise-like nature of whispered speech makes WAD more challenging than a typical voice activity detection (VAD) problem. In this paper, we propose a WAD feature based on the long-term variation of the logarithm of the short-time sub-band signal energy. We also propose an automatic sub-band selection algorithm that maximally discriminates noisy whisper from noise. Experiments with eight noise types at four signal-to-noise ratio (SNR) conditions show that, for most of the noise types, the proposed WAD scheme performs significantly better than existing VAD and whisper detection schemes when they are used for WAD.
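
To make the idea of a long-term variation of log sub-band energy concrete, the following is a minimal sketch and not the paper's exact feature. It assumes the input has already been band-pass filtered to one sub-band, and it uses the variance of the log energy over the most recent frames as a stand-in for "long-term variation", since the abstract does not give the precise definition. The function names, frame sizes, context length, and the toy signals are all hypothetical choices made for illustration.

```python
import math
import random
from statistics import mean, pvariance


def short_time_log_energy(x, frame_len, hop, eps=1e-12):
    """Log energy of each frame of a signal x (assumed to be one sub-band)."""
    log_energy = []
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = x[start:start + frame_len]
        energy = sum(s * s for s in frame)
        # eps avoids log(0) on silent frames
        log_energy.append(math.log(energy + eps))
    return log_energy


def long_term_variation(log_energy, context):
    """Per-frame variance of the log energy over the last `context` frames.

    Variance is an assumed measure of variation, chosen for this sketch.
    """
    out = []
    for m in range(len(log_energy)):
        window = log_energy[max(0, m - context + 1):m + 1]
        out.append(pvariance(window))
    return out


# Toy comparison: stationary noise versus the same noise with a
# slowly switching gain (a crude proxy for energy that fluctuates over time).
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(8000)]
bursty = [s * (4.0 if (i // 1000) % 2 else 1.0) for i, s in enumerate(noise)]

flat_score = mean(long_term_variation(short_time_log_energy(noise, 200, 100), 20))
bursty_score = mean(long_term_variation(short_time_log_energy(bursty, 200, 100), 20))
```

In this toy setting the fluctuating signal yields a larger average score than the stationary noise, which is the kind of separation a detector could threshold. Whether whispered speech and real noise types separate this way in a given sub-band is exactly what the proposed sub-band selection is meant to address, and is not shown by this sketch.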
