Abstract

In multi-talker situations, listeners face the challenge of identifying a target speech source within a mixture of interfering background noise. This study investigated how listeners analyze audio-visual scenes that vary in complexity in terms of the number of talkers and reverberation. The visual information of the room was either congruent or incongruent with the acoustic room. The listeners' task was to locate an ongoing speech source in a mixture of other speech sources. The three-dimensional audio-visual scenarios were presented using a loudspeaker array and virtual reality glasses. Both room reverberation and the number of talkers in a scene influenced the ability to analyze an auditory scene, in terms of accuracy and response time. Incongruent visual information of the room did not affect this ability. When few talkers were presented simultaneously, listeners were able to detect a target talker quickly and accurately, even in adverse room-acoustic conditions. Reverberation started to affect response time when four or more talkers were presented. The number of talkers became a significant factor for five or more simultaneous talkers.

