Abstract

Non-modal phonation conveys both linguistic and paralinguistic information and is distinguished by acoustic source and filter features. Detecting non-modal phonation in speech requires reliable F0 analysis, which is a problem for telephone-band speech, where F0 analysis frequently fails. We demonstrate an approach to detecting creaky phonation in telephone speech based on robust F0 and spectral analysis. The F0 analysis relies on an autocorrelation algorithm applied to the inverse-filtered speech signal and succeeds in regions of non-modal phonation where F0 analysis of the unfiltered signal typically fails. In addition to the extracted F0 values, spectral amplitude is measured at the first two harmonics (H1, H2) and the first three formants (A1, A2, A3). F0 and spectral tilt are measured from 300 samples of modal and creaky voice vowels, selected from Switchboard telephone speech using auditory and visual criteria. Results show successful F0 detection in creaky voice regions, with distinctively low F0, and statistically significant differences between modal and creaky voice in measures of spectral amplitude, especially for measures based on H1. Our current work develops methods for the automatic detection of creaky voicing in spontaneous speech based on the analysis technique shown here. [Work supported by NSF.]
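
The analysis chain described in the abstract (inverse filtering, autocorrelation F0 estimation, and harmonic amplitude measurement) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the inverse-filtering method, so the sketch assumes LPC inverse filtering (autocorrelation method with Levinson-Durbin recursion). The LPC order, the 40 to 400 Hz search range, the window choices, and all function names are hypothetical and chosen only for illustration. The amplitude function evaluates a single windowed DFT bin at a given frequency, from which a difference such as H1 minus H2 could be formed as one possible spectral-tilt measure.

```python
import math


def autocorrelation(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of the sequence x."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc_coefficients(frame, order=10):
    """Coefficients [1, a1, ..., ap] of the inverse filter A(z).

    Assumption: autocorrelation-method LPC on a Hamming-windowed frame,
    solved with the Levinson-Durbin recursion.
    """
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    r = autocorrelation(windowed, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    if err <= 0.0:  # silent frame: nothing to model
        return a
    for i in range(1, order + 1):
        acc = sum(a[j] * r[i - j] for j in range(i))
        k = -acc / err  # reflection coefficient
        prev = a[:]
        for j in range(1, i + 1):
            a[j] = prev[j] + k * prev[i - j]
        err *= 1.0 - k * k
        if err <= 0.0:  # perfectly predictable signal: stop early
            break
    return a


def inverse_filter(frame, order=10):
    """LPC residual: the frame passed through the FIR filter A(z)."""
    a = lpc_coefficients(frame, order)
    return [sum(a[j] * frame[i - j] for j in range(len(a)) if i - j >= 0)
            for i in range(len(frame))]


def autocorr_f0(x, fs, f0_min=40.0, f0_max=400.0):
    """F0 estimate (Hz) from the autocorrelation peak in the lag range."""
    mean = sum(x) / len(x)
    centered = [s - mean for s in x]
    lag_min = int(fs / f0_max)
    lag_max = min(int(fs / f0_min), len(x) - 1)
    r = autocorrelation(centered, lag_max)
    best_lag = max(range(lag_min, lag_max + 1), key=lambda lag: r[lag])
    return fs / best_lag


def amplitude_db(frame, fs, freq):
    """Magnitude in dB (arbitrary reference) of a Hann-windowed DFT at freq."""
    n = len(frame)
    re = im = 0.0
    for i, s in enumerate(frame):
        w = 0.5 - 0.5 * math.cos(2 * math.pi * i / n)  # periodic Hann window
        phase = 2 * math.pi * freq * i / fs
        re += s * w * math.cos(phase)
        im -= s * w * math.sin(phase)
    return 20 * math.log10(math.hypot(re, im) + 1e-12)
```

Under these assumptions, a frame of 8 kHz telephone speech would be processed as `f0 = autocorr_f0(inverse_filter(frame), 8000)`, followed by `amplitude_db(frame, 8000, f0) - amplitude_db(frame, 8000, 2 * f0)` for an H1 minus H2 value. Evaluating a single bin at exactly F0 and 2 F0 is a simplification; a practical system would likely search for the spectral peak near each harmonic.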
