Recent hearing research has benefited from the latest virtual reality systems, which allow the reproduction of immersive audio-visual scenarios and thereby enable more ecologically valid listening tests. Indeed, efforts have been made to identify the aspects that convey actual ecological validity, particularly by investigating the effects of visual cues and self-motion on speech intelligibility through tests based mainly on simulated scenes. However, further work is needed on scenes created from real recordings inside reverberant environments. This study used 3rd-order ambisonics recordings and stereoscopic 360° videos captured inside a reverberant conference hall to create three virtual audio-visual scenes in which speech intelligibility tests were performed, with informational noise introduced from different angles. A 16-loudspeaker spherical array synchronized with a head-mounted display was used to administer the immersive tests to 50 normal-hearing subjects. First, tests comprising only the auditory scenes were compared, based on the achieved scores, with tests that also provided contextual and positional source-related visual cues, both with and without self-motion, for a total of four test configurations. Then, to complete the investigation of the impact of visual cues on speech intelligibility, ten normal-hearing subjects were recruited to perform audio-visual tests incorporating lip-sync-related visual cues for the target speech.