Abstract

This paper studies how contrastive focus is conveyed by prosody, both articulatorily and acoustically, and how viewers extract focus structure from visual prosodic realizations. Is the visual modality useful for the perception of prosody? An audiovisual corpus was recorded from a male native speaker of French. The sentences had a subject–verb–object (SVO) structure. Four focus conditions were studied: contrastive focus on each phrase (S, V or O) and broad focus. Normal and reiterant modes were recorded, but only the latter was studied. An acoustic validation (fundamental frequency, duration and intensity) showed that the speaker had pronounced the utterances with a typical focused intonation on the focused phrase. Lip height and jaw opening were then extracted from the video data. An articulatory analysis suggested a set of possible visual cues to focus for reiterant /ma/ speech: (a) prefocal lengthening; (b) large jaw opening and high opening velocities on all the focused syllables; (c) long lip closure for the first focused syllable; and (d) hypo-articulation (reduced jaw opening and duration) of the following phrases. A visual perception test was then developed. It showed that (a) contrastive focus was well perceived visually for reiterant speech; (b) no training was necessary; and (c) subject focus was slightly easier to identify than the other focus conditions. We also found that perception was enhanced when the visual cues identified in our articulatory analysis were present and marked. This suggests that the visual cues extracted from the corpus are probably the ones that are perceptually salient.

