This study investigated the development of audiovisual speech perception in monolingual Uzbek-speaking and bilingual Uzbek–Russian-speaking children, focusing on the impact of language experience on audiovisual speech perception and on the roles of visual phonetic cues (i.e., mouth movements corresponding to phonetic/lexical information) and visual temporal cues (i.e., the timing of speech signals). A total of 321 children aged 4 to 10 years in Tashkent, Uzbekistan, discriminated /ba/ and /da/ syllables across three conditions: auditory-only, audiovisual phonetic (i.e., sound accompanied by mouth movements), and audiovisual temporal (i.e., sound onset/offset accompanied by mouth opening/closing). Effects of modality (audiovisual phonetic, audiovisual temporal, or auditory-only), age, group (monolingual or bilingual), and their interactions were tested using a Bayesian regression model. Overall, older children performed better than younger children, and performance was better in the audiovisual phonetic modality than in the auditory-only modality. However, no significant difference between monolingual and bilingual children was observed in any modality, in contrast to earlier studies. We attribute this discrepancy between our findings and the existing literature to the cross-linguistic similarity of the language pairs involved. When the languages spoken by bilinguals exhibit substantial linguistic similarity, there may be a greater need to disambiguate speech signals, leading to a greater reliance on audiovisual cues. The limited phonological similarity between Uzbek and Russian might have minimized bilinguals' need to rely on visual speech cues, contributing to the lack of group differences in our study.