Abstract

This study describes the detection of nasal closure and nasal release landmarks as part of a larger speech recognition system based on acoustic cues. Landmarks arise from closures and releases in the oral region and appear as abrupt changes in the speech signal. Nasal closure and release landmarks have proven particularly challenging to detect and are the focus of this report. Implementing the nasal detection module involves two steps. First, a set of speech-related measurements, such as formant frequencies, spectral band energies, and their derivatives, is extracted and processed from a large database of labeled speech files, and analysis of variance (ANOVA) is used to determine which of these measurements are potentially effective. Next, Gaussian mixture models are trained and tested on these measurements to classify nasal closures, nasal releases, and all other landmark cues. The resulting nasal closure and release landmark detection module will be used in the overall speech recognition system alongside other landmark modules for vowels, glides, fricative closures/releases, and stop closures/releases, as well as other acoustic cues to place and voicing. The current performance of the module will be assessed and discussed.
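
The two-step process described above (screen the measurements with ANOVA, then fit class-conditional Gaussian models) can be illustrated with a minimal sketch. This is not the study's implementation: the data are synthetic, the function names and the F-statistic threshold are hypothetical, and a single Gaussian per class (the one-component special case of a Gaussian mixture) stands in for the full mixture models so that the example needs only the Python standard library.

```python
import math
import random
from statistics import mean, pvariance

# Hypothetical class labels mirroring the three-way classification in the abstract.
CLASSES = ["nasal_closure", "nasal_release", "other"]


def anova_f(groups):
    """One-way ANOVA F statistic for a list of per-class sample lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def make_synthetic(n_per_class=50, seed=0):
    """Synthetic stand-in for labeled measurements (assumption, not real data).

    Feature 0 has a class-dependent mean (informative); feature 1 is noise.
    """
    rng = random.Random(seed)
    return {
        label: [(rng.gauss(3.0 * i, 1.0), rng.gauss(0.0, 1.0))
                for _ in range(n_per_class)]
        for i, label in enumerate(CLASSES)
    }


def screen_features(data, threshold=10.0):
    """Keep features whose F statistic exceeds an arbitrary, assumed threshold."""
    n_features = len(next(iter(data.values()))[0])
    scores = [
        anova_f([[row[j] for row in rows] for rows in data.values()])
        for j in range(n_features)
    ]
    keep = [j for j, f in enumerate(scores) if f > threshold]
    return scores, keep


def fit_gaussians(data, keep):
    """Fit one diagonal Gaussian per class on the retained features."""
    model = {}
    for label, rows in data.items():
        model[label] = []
        for j in keep:
            col = [row[j] for row in rows]
            model[label].append((mean(col), pvariance(col)))
    return model


def log_likelihood(x, params, keep):
    """Log density of x under a diagonal Gaussian with (mean, variance) pairs."""
    total = 0.0
    for (mu, var), j in zip(params, keep):
        total += -0.5 * (math.log(2 * math.pi * var) + (x[j] - mu) ** 2 / var)
    return total


def classify(x, model, keep):
    """Return the class whose Gaussian assigns x the highest log-likelihood."""
    return max(model, key=lambda label: log_likelihood(x, model[label], keep))


data = make_synthetic()
scores, keep = screen_features(data)
model = fit_gaussians(data, keep)
print("F statistics per feature:", scores)
print("Retained feature indices:", keep)
print("Predicted class for (0.0, 0.0):", classify((0.0, 0.0), model, keep))
```

In practice the screening would more likely use ANOVA p-values than a raw F threshold, and each class would be modeled with several mixture components fitted by expectation-maximization. Both choices are simplified here to keep the sketch short.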
