Abstract
Prior work has shown that the mouth area can yield articulatory features of speech segments and durational information (Navarra et al., 2010), while pitch and speech amplitude are cued by the eyebrows and other head movements (Hamarneh et al., 2019). It has also been reported that adults look more at the mouth when evaluating speech information in a non-native language (Barenholtz et al., 2016). In the present study, we ask how listeners' visual scanning of a talking face is affected by task demands that specifically target prosodic and segmental information, a question not examined in prior work. Twenty-five native English speakers heard two audio sentences in English (the native language) or Mandarin (the non-native language) that could differ in segmental information, prosodic information, or both, and then saw a silent video of a talking face. Their task was to judge whether the video matched the first or the second audio sentence (or whether the two sentences were the same). The results show that although looking was generally weighted towards the mouth, reflecting task demands, increased looking to the mouth predicted correct responses only on Mandarin trials. This effect was more pronounced in the Prosody and Both conditions than in the Segment condition (p < 0.05). The results suggest a link between mouth-looking and the extraction of speech-relevant information at both prosodic and segmental levels, but only under high cognitive load.