Successful language learning in bilinguals requires differentiating two language systems. The capacity to discriminate rhythmically close languages has been reported in 4-month-olds using auditory-only stimuli. This research offers a novel perspective on early language discrimination using audiovisual material. Monolingual and bilingual infants were first habituated to a face talking in their native language (for bilingual infants, the more frequent of their languages) and then tested on two successive language switches by the same speaker, one to a close language and one to a distant language. Exposure to code-switching was indexed from parental questionnaires. Results revealed that while monolinguals could detect both the close- and the distant-language switch, bilinguals reacted only to the distant-language switch, regardless of their code-switching experience at home. Regarding timing, the analyses showed that detecting a language switch required at least 10 s, suggesting that the audiovisual presentation (here, the same speaker switching languages) slowed or even hindered detection. These results suggest that detecting a multimodal close-language switch is a challenging task, especially for bilingual infants exposed to phonologically and rhythmically close languages. The current research lays the groundwork for further studies exploring the role of indexical cues and selective attention processes in language switch detection.