Abstract

Robustness to ambient noise, varying vocal effort, and the availability of only short-duration test utterances pose major challenges for developers of automated speech-enabled applications. Recent studies have proposed vocal effort-matched speaker models as a potential solution to these challenges. However, detecting whispered speech in extremely noisy environments is not a trivial task. This letter proposes auditory-inspired modulation spectral features as a means of separating speech components from environment-based components, enabling accurate whispered speech detection at signal-to-noise ratios as low as 0 dB. Experimental results show that the proposed detection algorithm outperforms two benchmark approaches.
