An Acoustic-Perceptual Study of Vocal Tremor

Supraja Anand, Rahul Shrivastav, Judith M Wingate, Neil N Chheda

doi:10.1016/j.jvoice.2012.02.007

Abstract

Essential tremor of the voice (ETV) is an involuntary intention tremor of the vocal folds that causes fluctuations in fundamental frequency (f(0)) and/or intensity, leading to an unsteady voice. There are limited data on how different acoustic variables affect the perceived severity of tremor. The purpose of this study was to determine whether systematic changes in f(0), the rate or frequency of modulation (f(f0m)), the extent or depth of modulation (d(f0m)), and signal-to-noise ratio (SNR) affect the perceived severity of tremor. Vowel phonations of four speakers (two male and two female) with a clinical diagnosis of ETV were selected from the Kay Elemetrics Disordered Voice Database (Lincoln Park, NJ). A high-fidelity speech vocoder (STRAIGHT; Kawahara, 1997) was used to synthesize the f(0) contour for each of these voices, which were varied in mean f(0), f(f0m), and d(f0m). The f(0) contour was modified 30 Hz above and below the mean f(0) for each speaker. f(f0m) ranged from 3 to 12 Hz in steps of 3 Hz. d(f0m) ranged from 2 to 32 Hz in steps of 6 Hz. Six listeners (three expert and three naïve) rated the "severity" of tremor on a seven-point rating scale. Significant main effects of, and interactions among, the study variables were found. Perceived severity of tremor increased with f(f0m) and d(f0m). There was no systematic effect of SNR on perceived tremor severity. The perception of severity for steady-state tremor results from a complex interaction of multiple acoustic cues, with d(f0m) acting as the primary acoustic cue.
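
To make the stimulus design concrete, the following is a minimal sketch of how the f(f0m) and d(f0m) levels stated in the abstract combine into a set of target f(0) contours. It is not the authors' STRAIGHT-based procedure. It assumes a purely sinusoidal modulation, treats d(f0m) as the peak-to-peak extent in Hz, and assumes the unshifted mean f(0) was presented alongside the +30 Hz and -30 Hz conditions. The function names, the 200 Hz example mean, and the 1000 Hz frame rate are hypothetical choices for illustration only.

```python
import math
from itertools import product

# Levels taken from the abstract.
RATES_HZ = [3, 6, 9, 12]             # f(f0m): 3 to 12 Hz in steps of 3 Hz
DEPTHS_HZ = [2, 8, 14, 20, 26, 32]   # d(f0m): 2 to 32 Hz in steps of 6 Hz
# Assumption: the unshifted mean is included with the +/-30 Hz shifts.
F0_SHIFTS_HZ = [-30, 0, 30]


def f0_contour(mean_f0, rate_hz, depth_hz, duration_s=1.0, frame_rate_hz=1000):
    """Return a sinusoidally modulated f0 contour in Hz, one value per frame.

    Assumption: depth_hz is the peak-to-peak extent, so the sinusoid's
    amplitude is depth_hz / 2 around mean_f0.
    """
    n_frames = int(duration_s * frame_rate_hz)
    amplitude = depth_hz / 2.0
    return [
        mean_f0 + amplitude * math.sin(2.0 * math.pi * rate_hz * i / frame_rate_hz)
        for i in range(n_frames)
    ]


def stimulus_grid(speaker_mean_f0):
    """List every (mean f0, rate, depth) combination for one speaker."""
    return [
        (speaker_mean_f0 + shift, rate, depth)
        for shift, rate, depth in product(F0_SHIFTS_HZ, RATES_HZ, DEPTHS_HZ)
    ]


if __name__ == "__main__":
    grid = stimulus_grid(200.0)  # hypothetical speaker mean f0 of 200 Hz
    print(len(grid), "combinations for this speaker")
    mean_f0, rate, depth = grid[0]
    contour = f0_contour(mean_f0, rate, depth)
    print("first contour spans", min(contour), "to", max(contour), "Hz")
```

Under these assumptions the grid holds 3 x 4 x 6 = 72 combinations per speaker, before any SNR manipulation, whose levels the abstract does not report.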
