Abstract

This paper examines the influence of hangover and hangbefore criteria on automatic speech recognition. A voice activity detection (VAD) algorithm is nowadays part of almost every automatic speech recognition system. Hangover and hangbefore criteria can be integrated into the VAD algorithm after the basic VAD decision. The open question is how many frames each criterion should cover. The duration of vowels, diphthongs and semivowels determines how many frames must be detected as speech before the hangover and hangbefore criteria are applied at all. Frames of consonants have low spectral energy; the energy of unvoiced fricatives, unvoiced stops and nasals is especially low. The basic VAD decision therefore first labels these frames as silence, and the hangover and hangbefore criteria then relabel them as speech. Speech recognition experiments show that hangover and hangbefore criteria can improve recognition results, and that the hangbefore criterion plays a more important role in speech recognition than the hangover criterion.
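
To make the mechanism concrete, the following is a minimal sketch of how hangbefore and hangover criteria might be applied to per-frame VAD decisions. It is an illustration under stated assumptions, not the authors' implementation: the function name `apply_hang_criteria`, the parameters `hangbefore`, `hangover` and `min_speech`, and their default values are all hypothetical and are not taken from the paper.

```python
from typing import List


def apply_hang_criteria(
    decisions: List[bool],
    hangbefore: int = 3,   # assumed: frames relabelled before a speech run
    hangover: int = 3,     # assumed: frames relabelled after a speech run
    min_speech: int = 5,   # assumed: shortest run that triggers the criteria
) -> List[bool]:
    """Extend sufficiently long speech runs backwards and forwards.

    `decisions` holds the basic per-frame VAD output (True = speech).
    Runs shorter than `min_speech` frames are left unchanged.
    """
    n = len(decisions)
    out = list(decisions)
    i = 0
    while i < n:
        if not decisions[i]:
            i += 1
            continue
        # Find the end of the current speech run [i, j).
        j = i
        while j < n and decisions[j]:
            j += 1
        if j - i >= min_speech:
            # Hangbefore: relabel frames preceding the run as speech.
            for k in range(max(0, i - hangbefore), i):
                out[k] = True
            # Hangover: relabel frames following the run as speech.
            for k in range(j, min(n, j + hangover)):
                out[k] = True
        i = j
    return out


if __name__ == "__main__":
    basic = [bool(x) for x in [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0]]
    smoothed = apply_hang_criteria(basic, hangbefore=2, hangover=1, min_speech=3)
    print([int(x) for x in smoothed])
```

In this sketch the five-frame speech run is extended by two frames at its start and one frame at its end, so low-energy consonant frames next to it would be passed to the recognizer as speech. The isolated single speech frame is shorter than `min_speech` and triggers neither criterion.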
