Audiovisual integration in multimedia communications based on MPEG-4 facial animation

Z S Bojkovic, D A Milovanovic

doi:10.1007/bf01201405

Abstract

Recent progress in audiovisual research shows that joint processing of audio and video provides advantages that are not available when the two are processed independently. The importance of the MPEG-4 standard lies in specifying an object-based audiovisual representation framework that integrates both natural and synthetic content. MPEG-4 includes a facial animation parameter (FAP) set that enables model-based representation of natural or synthetic talking-head sequences and allows intelligible visual reproduction of facial expressions, emotions, and speech pronunciation at the receiver. This paper reviews coding methods for bit-rate reduction of FAPs, which make it possible to transmit multiple talking heads over band-limited channels. Further, we emphasize the relationship between natural and synthetic audiovisual coding from the point of view of integrating face animation with natural video. Within MPEG-4, the Binary Format for Scenes (BIFS) description framework offers a parametric methodology for scene structure representation and efficient coding for transmission or storage. We also address the MPEG-4 profiling strategy in facial animation, which is intended to ensure that the standard provides adequate solutions for applications in multimedia communications.

Full Text