Mouth shape synthesizing

Masahide Kaneko

doi:10.1121/1.1500920

Abstract

A picture-synthesizing apparatus and method for synthesizing a moving picture of a person’s face having mouth-shape variations from a train of input characters. The method comprises: developing a train of phonemes from the train of input characters; using a speech synthesis technique to output, for each phoneme, a corresponding vocal sound feature that includes the articulation mode and the duration of that phoneme; determining, for each phoneme, a mouth-shape feature on the basis of the corresponding vocal sound feature, the mouth-shape feature including the degree of opening of the mouth, the degree of roundness of the lips, the height of the lower jaw in a raised and a lowered position, and the degree to which the tongue is seen; determining, for each phoneme, values of mouth-shape parameters that represent a concrete mouth shape on the basis of the mouth-shape feature; and controlling the values of the mouth-shape parameters for each frame of the moving picture in accordance with the duration of each phoneme, thereby synthesizing a moving picture whose mouth-shape variations match the speech that would be heard if the train of input characters were read aloud.
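The flow described in the abstract (phonemes with durations, then per-phoneme mouth-shape features, then per-frame control) can be illustrated with a minimal Python sketch. It is not the patented method: the `Phoneme` class, the `MOUTH_FEATURES` table and its values, the `NEUTRAL` fallback, and the choice of linear interpolation between targets placed at phoneme midpoints are all assumptions made for illustration. The sketch also skips the separate step of converting features into concrete mouth-shape parameters and interpolates the four features directly.

```python
from dataclasses import dataclass


@dataclass
class Phoneme:
    symbol: str         # phoneme label (hypothetical notation)
    duration_ms: float  # duration, as a speech synthesizer might report it


# Hypothetical feature table, each value in [0, 1]:
# (mouth opening, lip roundness, lower-jaw height, tongue visibility)
MOUTH_FEATURES = {
    "a": (1.0, 0.2, 0.0, 0.0),
    "i": (0.3, 0.0, 0.7, 0.0),
    "u": (0.2, 1.0, 0.8, 0.0),
    "m": (0.0, 0.3, 1.0, 0.0),
    "l": (0.4, 0.1, 0.5, 1.0),
}
NEUTRAL = (0.0, 0.0, 1.0, 0.0)  # assumed closed-mouth fallback


def features_for(phoneme):
    """Look up the mouth-shape feature for one phoneme."""
    return MOUTH_FEATURES.get(phoneme.symbol, NEUTRAL)


def interpolate(keys, t):
    """Linearly interpolate feature tuples between timed key targets."""
    if t <= keys[0][0]:
        return keys[0][1]
    if t >= keys[-1][0]:
        return keys[-1][1]
    for (t0, f0), (t1, f1) in zip(keys, keys[1:]):
        if t0 <= t <= t1:
            if t1 == t0:  # guard against zero-length spans
                return f1
            w = (t - t0) / (t1 - t0)
            return tuple(a + w * (b - a) for a, b in zip(f0, f1))
    return keys[-1][1]


def frames_for(phonemes, fps=30.0):
    """Return one feature tuple per video frame for a phoneme train."""
    if not phonemes:
        return []
    keys = []
    start = 0.0
    for p in phonemes:
        # Assumption: each target is reached at the phoneme's midpoint.
        keys.append((start + p.duration_ms / 2.0, features_for(p)))
        start += p.duration_ms
    n_frames = int(start * fps / 1000.0)
    return [interpolate(keys, i * 1000.0 / fps) for i in range(n_frames)]


# Example: the syllable "ma" at 10 frames per second.
frames = frames_for([Phoneme("m", 100), Phoneme("a", 200)], fps=10.0)
```

Because the frame count is derived from the summed phoneme durations, the resulting animation has the same length as the synthesized speech, which is what keeps the mouth shapes aligned with the audio.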
