Abstract

The vocal folds are a pair of membranous structures essential for phonation during speech production. They vary in length and thickness from infants to adults and also differ across genders. These differences are responsible for the pitch variation across speakers. Estimating the actual vocal fold length is a formidable task because of the folds' location and because the folds are longer (due to tension within the cartilage muscles) when open than when closed. In this paper, we focus on estimating the length of the vocal folds using the fundamental frequency (F0) contour obtained with the Fujisaki model and the relationship of F0 with stress and tissue density. The novelty of the proposed method lies in estimating the vocal fold length from the speech signal alone. The proposed method uses the strength of excitation (SOE) at glottal closure instants (GCIs) as an indirect correlate of stress along the vocal folds during phonation. Results are shown using the CMU-ARCTIC database and infant cry data. The proposed method was observed to capture the variations in vocal fold length across gender as well as across age. To the best of the authors' knowledge, this is the first study of its kind reported in the literature.
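
As an illustration of how F0, stress and tissue density can be tied to fold length, the sketch below uses the textbook ideal-string approximation, F0 = (1 / 2L) * sqrt(stress / density), solved for L. This is an assumption made here for illustration; the abstract does not state which relation the paper uses. The function name and the numeric values are hypothetical, and the stress is passed in directly rather than derived from SOE at GCIs as in the proposed method.

```python
import math


def estimate_fold_length(f0_hz, stress_pa, density_kg_m3):
    """Estimate vibrating fold length (metres) from the ideal-string relation.

    Assumed model (not taken from the paper): F0 = (1 / (2 * L)) * sqrt(stress / density),
    so L = sqrt(stress / density) / (2 * F0).
    """
    if f0_hz <= 0 or stress_pa <= 0 or density_kg_m3 <= 0:
        raise ValueError("f0, stress and density must all be positive")
    return math.sqrt(stress_pa / density_kg_m3) / (2.0 * f0_hz)


# Illustrative inputs only: the stress and density values are placeholders,
# not measurements reported by the authors.
length_m = estimate_fold_length(f0_hz=120.0, stress_pa=10_000.0, density_kg_m3=1040.0)
print(f"estimated length: {length_m * 1000:.1f} mm")
```

Under this model, for a fixed stress and density, a higher F0 implies a proportionally shorter fold, which is consistent with the pitch differences across age and gender described above.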
