Psychoacoustic roughness as creaky voice predictor

Julián Villegas, Seunghun J Lee, Jeremy Perkins

doi:10.1121/1.4970870

Abstract

The use of psychoacoustic roughness as a predictor of creaky voice is reported. Roughness, a prothetic sensation elicited by rapid changes in the temporal envelope of a sound (15-300 Hz), shares qualitative similarities with a kind of phonation known as vocal fry or creakiness. When a creakiness classification made by trained linguists was used as a reference, a classifier based on an objective temporal roughness model yielded results similar to those of an artificial neural network-based predictor of creakiness, but the roughness-based classifier tended to produce more type I errors (false positives). We also compare the roughness-based predictions with classifications made by listeners sampled from three populations that use creakiness contrastively to different degrees: Japanese (where creakiness is not systematically used for phonetic contrast), Mandarin (where creakiness is used as a secondary cue), and Vietnamese (where creakiness is used as a phonetic contrast between tones). The roughness-based classification seems to agree better with the classifications made by these untrained listeners. Our findings suggest that extreme roughness values (>4 asper) in combination with local prominences on the roughness temporal profile of vocalic segments could be used to classify creaky intervals in running speech.
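The decision rule suggested at the end of the abstract can be illustrated with a minimal sketch. It assumes that a roughness profile (one value in asper per analysis frame of a vocalic segment) has already been computed by some roughness model. The function name `creaky_candidates`, the simple neighbour comparison used to detect a local prominence, and the sample values are all hypothetical; only the 4 asper threshold comes from the abstract, and this is not presented as the authors' implementation.

```python
from typing import List, Sequence

# Threshold taken from the abstract: "extreme roughness values (>4 asper)".
ASPER_THRESHOLD = 4.0


def creaky_candidates(
    roughness: Sequence[float], threshold: float = ASPER_THRESHOLD
) -> List[int]:
    """Return indices of frames that are local peaks above the threshold.

    Assumption: a "local prominence" is approximated as a frame whose value
    is strictly greater than the previous frame and not smaller than the
    next one. The first and last frames are never reported.
    """
    hits = []
    for i in range(1, len(roughness) - 1):
        is_peak = roughness[i - 1] < roughness[i] >= roughness[i + 1]
        if is_peak and roughness[i] > threshold:
            hits.append(i)
    return hits


# Hypothetical roughness profile (asper per frame), for illustration only.
profile = [0.8, 1.2, 4.6, 5.3, 4.1, 2.0, 3.9, 1.0, 4.2, 4.2, 0.5]
print(creaky_candidates(profile))  # [3, 8]
```

In this sketch the peak at frame 6 (3.9 asper) is rejected because it does not exceed the threshold, while frames 3 and 8 satisfy both conditions. A more robust peak detector, for example one with a minimum prominence requirement, could be substituted for the neighbour comparison.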

Full Text