Abstract

An approach to categorizing consonant-vowel-consonant (CVC) syllables is described that partially compensates for durational changes due to factors such as varying speaking rate. The approach is primarily one of pattern matching. Each CVC syllable in the codebook is represented by an elastic template, analogous to a rubber mask that fits over the spectrogram of a prototypical CVC syllable. To maximize the fit to a given sample, a template may be stretched or compressed in the time dimension. The algorithm penalizes distortions that change the relative positions of spectral features; global distortions are penalized less severely. Nonlinearities in durational changes are modeled by the equivalent of making the elastic sheet thicker in some places than in others: the thicker the elastic, the less a spectral feature is affected by global changes in rate. Both the learning of the prototype templates and the fitting of templates to samples are implemented using gradient descent. Results are presented on categorizing stylized CVC syllables containing tense and lax vowels and voiced and voiceless stop consonants.
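
The elastic-thickness idea can be illustrated with a toy model. The sketch below is not the paper's algorithm: it assumes each template is reduced to a handful of feature times joined by springs, that "thickness" acts as spring stiffness, and that fitting minimizes a quadratic mismatch plus deformation energy by plain gradient descent. It also omits the separate, milder penalty on global distortion. All names (`fit_template`, `classify`, `pull`, `lam`) and all numeric values are hypothetical.

```python
import numpy as np

def elastic_energy(t, observed, proto, thickness, pull, lam=1.0):
    """Mismatch to the sample plus the cost of deforming the template."""
    t, observed, proto = (np.asarray(a, dtype=float) for a in (t, observed, proto))
    data = np.sum(np.asarray(pull) * (t - observed) ** 2)
    stretch = np.diff(t) - np.diff(proto)  # change in each segment's length
    return data + lam * np.sum(np.asarray(thickness) * stretch ** 2)

def energy_grad(t, observed, proto, thickness, pull, lam=1.0):
    """Analytic gradient of elastic_energy with respect to t."""
    grad = 2.0 * np.asarray(pull) * (t - observed)
    tension = 2.0 * lam * np.asarray(thickness) * (np.diff(t) - np.diff(proto))
    grad[:-1] -= tension  # each segment acts on its left end point
    grad[1:] += tension   # and, with opposite sign, on its right end point
    return grad

def fit_template(observed, proto, thickness, pull=None, lam=1.0,
                 lr=0.02, steps=3000):
    """Deform the template toward the sample by gradient descent."""
    observed = np.asarray(observed, dtype=float)
    proto = np.asarray(proto, dtype=float)
    pull = np.ones_like(proto) if pull is None else np.asarray(pull, dtype=float)
    t = proto.copy()  # start from the undeformed prototype
    for _ in range(steps):
        t -= lr * energy_grad(t, observed, proto, thickness, pull, lam)
    return t, elastic_energy(t, observed, proto, thickness, pull, lam)

def classify(observed, templates):
    """Return the label of the template with the lowest fitted energy."""
    energies = {label: fit_template(observed, p, w)[1]
                for label, (p, w) in templates.items()}
    return min(energies, key=energies.get)

# Hypothetical template: four feature times, with thick (stiff) consonant
# transitions on either side of a thin (stretchy) vowel segment.
proto = np.array([0.0, 1.0, 2.0, 3.0])
thickness = np.array([4.0, 1.0, 4.0])

# A slower token: only the syllable's onset and offset are pinned to the
# sample (pull = 0 elsewhere, so the interior observed values are ignored).
fitted, energy = fit_template(observed=[0.0, 0.0, 0.0, 4.5], proto=proto,
                              thickness=thickness, pull=[1.0, 0.0, 0.0, 1.0])
print("segment extensions:", np.diff(fitted) - np.diff(proto))

# Categorization: pick the template that needs the least deformation.
templates = {
    "short vowel": (proto, thickness),
    "long vowel": (np.array([0.0, 1.0, 3.0, 4.0]), thickness),
}
label = classify([0.0, 1.1, 3.3, 4.4], templates)
print("best template:", label)
```

In this toy model the segments behave like springs in series, so when the whole syllable is lengthened the thin vowel segment absorbs more of the extra duration than the thick transitions. This is one way to read the statement that thicker elastic makes a feature less sensitive to global changes in rate.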
