Abstract

Voice quality varies not only under supra-segmental influences but also with segmental factors such as phoneme class, vowel quality, nasalization, and airstream mechanism. These factors determine a largely unexplored micro-prosodic phenomenon: phone-intrinsic voice quality, which causes voice quality coarticulation and voice quality transitions in fluent speech. I subsume all these phenomena under the high-frequency components of prosody. Since the high-frequency and low-frequency (supra-segmental) components of voice quality prosody are superposed, and thus jointly encoded, the main goal of this paper is to separate them and make both accessible to speech research. In 2003, Mokhtari, Pfitzinger, and Ishi introduced a holistic voice quality parameter extractor: it applies a principal components analysis to a database of glottal-flow waveforms so that all underlying glottal-flow waveforms can later be reconstructed and interpolated from just a few principal components. Recently, by applying this basic principle to a large corpus of manually segmented glottal-flow waveforms from 44 speakers, I dramatically improved its applicability to arbitrary speech signals. Applying this version to a large speech database, and then high-pass and low-pass filtering the resulting voice quality parameters, yielded phone-intrinsic voice quality parameter sets as well as slowly varying, meaningful voice quality parameter contours.
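
The separation of superposed slow and fast components can be illustrated with a minimal sketch. It assumes a single voice quality parameter sampled at a fixed frame rate and uses a centered moving average as a stand-in low-pass filter; the abstract does not specify the filters actually used, so the function names, the window length, and the synthetic contour below are all hypothetical.

```python
import math

def moving_average(contour, window):
    """Centered moving average; a crude low-pass filter (assumed, not the paper's filter)."""
    half = window // 2
    smoothed = []
    for i in range(len(contour)):
        lo = max(0, i - half)
        hi = min(len(contour), i + half + 1)  # window shrinks at the edges
        smoothed.append(sum(contour[lo:hi]) / (hi - lo))
    return smoothed

def split_contour(contour, window=25):
    """Split a parameter contour into a slow part and a complementary fast part."""
    low = moving_average(contour, window)           # slowly varying component
    high = [x - l for x, l in zip(contour, low)]    # residual: fast component
    return low, high

# Synthetic contour (hypothetical data): slow trend plus a fast, phone-rate wiggle.
n = 200
slow = [math.sin(2 * math.pi * i / n) for i in range(n)]
fast = [0.2 * math.sin(2 * math.pi * i / 5) for i in range(n)]
contour = [s + f for s, f in zip(slow, fast)]

low, high = split_contour(contour, window=25)
```

Because the high-pass part is defined as the residual of the low-pass part, the two components sum back to the original contour, which mirrors the idea that both layers are superposed in one signal. A real analysis would likely use a properly designed filter pair with a cutoff chosen from phone durations.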
