Abstract

Most verbal communication uses cues from both the visual and acoustic modalities to convey messages. During speech production, the visible information provided by the external articulatory organs can influence understanding of the language, as the listener integrates the combined information into meaningful linguistic expressions. Automated systems that integrate speech and image data can emulate this bimodal human interaction. Such systems have a wide range of applications, for example in videophone systems, where the interdependencies between image and speech signals can be exploited for data compression and for solving the long-standing problem of lip synchronization. The objective of this work is therefore to investigate and quantify this relationship, so that the knowledge gained will assist longer-term multimedia and videophone research. © 1999 SPIE and IS&T. (S1017-9909(99)00703-5)
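
To make the stated objective concrete, the following is a minimal sketch of one way an audio-visual relationship of this kind could be quantified: computing the Pearson correlation between a per-frame acoustic feature and a per-frame visual feature. The abstract does not specify the paper's actual features or method, so the function, the choice of features, and the synthetic signals below are hypothetical illustrations only.

```python
import numpy as np

def audio_visual_correlation(audio_energy, lip_opening):
    """Pearson correlation between a per-frame acoustic feature
    (e.g., short-time energy) and a per-frame visual feature
    (e.g., vertical lip opening). Both arrays are assumed to be
    sampled at the same frame rate and aligned in time.
    """
    a = np.asarray(audio_energy, dtype=float)
    v = np.asarray(lip_opening, dtype=float)
    # Standardize each signal, then average the product: this is
    # exactly the Pearson correlation coefficient.
    a = (a - a.mean()) / a.std()
    v = (v - v.mean()) / v.std()
    return float(np.mean(a * v))

# Hypothetical usage with synthetic, loosely coupled signals
# standing in for real acoustic and lip-tracking measurements.
rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 200)
audio = np.sin(t) + 0.3 * rng.standard_normal(200)  # stand-in acoustic envelope
video = np.sin(t) + 0.5 * rng.standard_normal(200)  # stand-in lip-opening track
print(f"audio-visual correlation: {audio_visual_correlation(audio, video):.2f}")
```

A single correlation coefficient is only the simplest such measure; work in this area typically also considers time-lagged or mutual-information measures, since visual articulation can lead or lag the acoustic signal.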
