Abstract

Inspired by Ken Stevens’ work on speech production, we investigated the role of the subglottal resonances (SGRs) in several machine-based speech processing algorithms. The subglottal acoustic system refers to the acoustic system below the glottis, consisting of the trachea, bronchi, and lungs. The work is based on the observations that the first and second subglottal resonances (Sg1 and Sg2) form phonological vowel feature boundaries, and that SGRs, especially Sg2, are fairly constant for a given speaker across phonetic contexts and languages. After collecting an acoustic and subglottal database of 50 adults and 48 children, analyzing the database, and developing algorithms to robustly estimate SGRs, we were able to use these resonances to improve the performance of a variety of speech processing algorithms, including recognition of children's speech in limited-data situations, frequency warping for adult speech recognition, speaker recognition and speaker verification, and automatic height estimation. [Work supported in part by the NSF.]
