Abstract

When the auditory and visual components of spoken audiovisual nonsense syllables are mismatched, perceivers produce four different types of perceptual responses, auditory correct, visual correct, fusion (the so-called McGurk effect), and combination (i.e., two consonants are reported). Here, quantitative measures were developed to account for the distribution of the four types of perceptual responses to 384 different stimuli from four talkers. The measures included mutual information, correlations, and acoustic measures, all representing audiovisual stimulus relationships. In Experiment 1, open-set perceptual responses were obtained for acoustic /bɑ/ or /lɑ/ dubbed to video /bɑ, dɑ, gɑ, vɑ, zɑ, lɑ, wɑ, ðɑ/. The talker, the video syllable, and the acoustic syllable significantly influenced the type of response. In Experiment 2, the best predictors of response category proportions were a subset of the physical stimulus measures, with the variance accounted for in the perceptual response category proportions between 17% and 52%. That audiovisual stimulus relationships can account for perceptual response distributions supports the possibility that internal representations are based on modality-specific stimulus relationships.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call