Abstract

Quantification of the acoustic characteristics of irregular phonation provides a foundation for automatic detection of regions of irregular phonation in continuous speech. Recent results for automatic classification of regions of phonation as either regular or irregular demonstrate classification rates greater than 90% (false-positive rate <10%) [K. Surana, M.Eng. thesis, MIT, Cambridge, MA 2006]. Similar acoustic cues may be useful in separating subtypes of irregular phonation. Two types of irregular phonation are examined: (1) regions characterized by reduced airflow, assumed to correspond to tightly adducted vocal folds with brief regions of separation, and (2) regions characterized by increased airflow, assumed to correspond to a spread or spreading vocal-fold configuration [J. Slifka, J. Voice (in press)]. Reduced-airflow tokens are extracted from utterance-medial locations using airflow, audio, and electroglottography signals; increased-airflow tokens are all utterance-final (20 tokens/speaker, 4 speakers). Surana’s cue set was evaluated for its ability to separate these subtypes of irregular phonation. Preliminary results indicate that an energy difference measure (between tokens smoothed at 6 ms vs 16 ms) and normalized rms amplitude yield statistically separable populations. Limitations on the use of these cues are discussed, including intra- and interspeaker variation and the effect of token duration. [Work supported by NIH DC02978.]
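
The two cues named above can be illustrated with a minimal sketch. The abstract does not give their exact definitions, so everything here is an assumption made for illustration: the function names (`short_time_energy`, `energy_difference`, `normalized_rms`) are hypothetical, the smoothing is taken to be a centered moving average of squared samples, the energy difference is taken to be the mean absolute gap between the 6 ms and 16 ms contours, and rms is normalized by a caller-supplied reference segment. It is not the cue set as implemented in the cited thesis.

```python
import math


def short_time_energy(samples, fs, window_ms):
    """Energy contour: squared samples smoothed by a centered moving average.

    Assumption: rectangular window of roughly window_ms, shrunk at the edges.
    """
    half = max(1, round(fs * window_ms / 1000)) // 2
    # Prefix sums of squared samples give each window sum in O(1).
    prefix = [0.0]
    for s in samples:
        prefix.append(prefix[-1] + s * s)
    n = len(samples)
    contour = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        contour.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return contour


def energy_difference(samples, fs, short_ms=6.0, long_ms=16.0):
    """Mean absolute difference between the short- and long-window contours.

    Assumption: this summary statistic stands in for the abstract's
    'energy difference measure'; the original definition may differ.
    """
    e_short = short_time_energy(samples, fs, short_ms)
    e_long = short_time_energy(samples, fs, long_ms)
    return sum(abs(a - b) for a, b in zip(e_short, e_long)) / len(samples)


def rms(samples):
    """Root-mean-square amplitude of a non-empty sequence."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def normalized_rms(token, reference):
    """Token rms divided by the rms of a reference segment (assumed normalizer)."""
    return rms(token) / rms(reference)


if __name__ == "__main__":
    fs = 8000  # hypothetical sampling rate in Hz
    # Sparse pulses every 10 ms, loosely mimicking widely spaced glottal pulses.
    pulses = [1.0 if i % 80 == 0 else 0.0 for i in range(800)]
    steady = [1.0] * 800
    print("pulse train:", energy_difference(pulses, fs))
    print("steady signal:", energy_difference(steady, fs))
```

Under these assumptions, a signal whose energy is concentrated in isolated pulses produces a larger gap between the two contours than a signal with steady energy, which is the intuition behind comparing two smoothing windows.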
