Abstract

The synthesizer is based on a nonlinear wave-shaping model of the glottal area, an algebraic model of the glottal aerodynamics, and concatenated-tube models of the trachea and vocal tract. Voice disorders are simulated by means of models of vocal frequency jitter and tremor, vocal amplitude shimmer and tremor, and pulsatile additive noise. Six experiments have been carried out to assess the synthesizer perceptually. Three involve the perceptual categorization of male synthetic and human stimuli, and a fourth the auditory discrimination between synthetic and human tokens. A fifth experiment reports the auditory discrimination between synthetic tokens with different levels of additive and modulation noise. A sixth reports the scoring by expert listeners of male synthetic stimuli on equal-appearing interval grade-roughness-breathiness (GRB) scales. A first objective is to demonstrate the ability of the synthesizer to simulate vowel sounds that are valid exemplars of speech sounds produced by humans with voice disorders. A second objective is to learn how expert human raters perceptually map vocal frequency, additive noise, modulation noise, and vowel category onto scores on the GRB scales.
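The paper's own glottal and vocal-tract models are not detailed in this abstract. As a rough, hypothetical illustration of how frequency jitter, amplitude shimmer, and additive noise perturb a periodic glottal source, a toy pulse-train sketch (all parameter values and the half-rectified-sine pulse shape are assumptions, not the paper's wave-shaping model) might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

fs = 16000           # sample rate (Hz); assumed value
f0 = 120.0           # mean vocal frequency (Hz); typical male voice, assumed
jitter = 0.02        # cycle-to-cycle frequency perturbation (2%); illustrative
shimmer = 0.05       # cycle-to-cycle amplitude perturbation (5%); illustrative
noise_level = 0.01   # additive noise level; illustrative

cycles = []
t_total = 0.0
while t_total < 1.0:  # synthesize one second of signal
    # Jitter: perturb the fundamental frequency of each glottal cycle.
    period = 1.0 / (f0 * (1.0 + jitter * rng.standard_normal()))
    n = max(1, int(round(period * fs)))
    # Shimmer: perturb the amplitude of each cycle.
    amp = 1.0 + shimmer * rng.standard_normal()
    # Crude glottal-like pulse: half-rectified sine over one cycle.
    cycle = amp * np.maximum(0.0, np.sin(2 * np.pi * np.arange(n) / n))
    cycles.append(cycle)
    t_total += period

signal = np.concatenate(cycles)
# Additive noise: here stationary for simplicity; the paper uses
# pulsatile (cycle-synchronous) noise instead.
signal += noise_level * rng.standard_normal(signal.size)
```

In a full simulation, slow sinusoidal modulation of the cycle frequency and amplitude would additionally model tremor, and the source would be filtered through the tract model.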
