Abstract

Vowel onset point (VOP) and vowel end point (VEP) are the instants at which a vowel starts and ends, respectively. VOPs and VEPs are equally important for accurate detection of vowels and for the development of different speech-based applications. Detecting VOPs and VEPs simultaneously with a single algorithm is very challenging. In this paper, an efficient approach is proposed for robustly extracting the magnitude dynamics at each time instant of the speech signal. The mean and variance of the magnitude dynamics over an analysis frame are significantly higher for vowels than for nonvowel, silence, and noise regions. In this study, the average magnitude dynamics (AMD) over an analysis frame is used as the front-end feature. The AMD values at each time instant are then nonlinearly mapped using a sigmoidal function, giving the NL-AMD, to sharpen the transitions at the VEPs and suppress the variations in the higher-magnitude regions. The NL-AMD is equally discriminative at the VOPs and the VEPs. Consequently, most of the VOPs and the VEPs are detected with a small deviation. The experimental evaluations presented in this study show that, for clean as well as noisy test conditions, the proposed feature outperforms the earlier reported front-end features for the task of detecting the VOPs and the VEPs.
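
The processing chain described above (magnitude dynamics, averaging over an analysis frame, sigmoidal mapping) can be illustrated with a minimal Python sketch. The abstract does not define how the magnitude dynamics are computed, so the sketch assumes a simple proxy: the absolute first difference of the sample magnitudes. The function names, the frame length, and the sigmoid `slope` and `midpoint` values are likewise hypothetical choices for illustration, not the authors' settings.

```python
import math

def magnitude_dynamics(signal):
    """Assumed proxy: absolute first difference of the sample magnitudes."""
    mags = [abs(s) for s in signal]
    return [0.0] + [abs(mags[i] - mags[i - 1]) for i in range(1, len(mags))]

def average_magnitude_dynamics(dyn, frame_len):
    """Mean of the dynamics over a centred analysis frame at each instant (AMD)."""
    half = frame_len // 2
    amd = []
    for n in range(len(dyn)):
        lo, hi = max(0, n - half), min(len(dyn), n + half + 1)
        amd.append(sum(dyn[lo:hi]) / (hi - lo))
    return amd

def nonlinear_map(amd, slope, midpoint):
    """Sigmoidal mapping of AMD values into (0, 1).

    Uses the identity sigmoid(x) = 0.5 * (1 + tanh(x / 2)), which avoids
    overflow for large |x|. slope and midpoint are illustrative parameters.
    """
    return [0.5 * (1.0 + math.tanh(0.5 * slope * (a - midpoint))) for a in amd]

# Toy signal: silence, a sinusoid standing in for a vowel, then silence.
signal = [0.0] * 100 + [math.sin(0.1 * math.pi * k) for k in range(200)] + [0.0] * 100

dyn = magnitude_dynamics(signal)
amd = average_magnitude_dynamics(dyn, frame_len=21)
nl_amd = nonlinear_map(amd, slope=50.0, midpoint=0.1)
```

On this toy signal the mapped feature stays near 0 in the silent segments and near 1 inside the sinusoid, with steep transitions at both boundaries. A detector could then look for rising and falling crossings of a threshold as candidate VOPs and VEPs, although that decision step is not described in the abstract.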
