Abstract

Loudspeaker-based virtual sound environments provide a valuable tool for studying speech perception in realistic but controllable and reproducible acoustic environments. However, the evaluation of different loudspeaker reproduction methods with respect to perceptual measures has been rather limited. This study compared speech intelligibility measured in a reverberant reference room with speech intelligibility measured in virtual versions of that room. Two reproduction methods were based on room acoustic simulations, presented using either mixed-order ambisonics or nearest loudspeaker mapping playback. The third method used impulse responses measured with a spherical microphone array and mixed-order ambisonics. Three factors that affect speech intelligibility were varied: reverberation, spatial configuration, and interferer type (speech or noise). Two interferers were either colocated with the target or separated from it symmetrically or asymmetrically. The results showed differences between the reference room and the simulation-based reproductions when the target and the interferers were spatially separated, but not when they were colocated. The reproduction based on the microphone array was most similar to the reference room in terms of measured speech intelligibility. Differences in speech intelligibility could be accounted for using a binaural speech intelligibility model that considers better-ear signal-to-noise ratio differences and binaural unmasking effects. Thus, auditory modeling might be a fast and efficient way to evaluate virtual sound environments.

Highlights

  • The gray shaded area represents just-noticeable differences (JNDs) for the results obtained in the reference room

Summary

Introduction

One of the challenges in hearing research is to understand the mechanisms involved in speech perception in complex acoustic scenarios, such as in a restaurant or at a social gathering, commonly referred to as a “cocktail-party” scenario (Bronkhorst, 2000; Cherry, 1953). Loudspeaker-based virtual sound environments (VSEs) can reproduce acoustic scenes in a laboratory to investigate how the auditory system functions in realistic listening scenarios. Using such a system, Koski et al. (2013) compared speech reception thresholds (SRTs) measured in a reference room in a multi-talker scenario with corresponding SRTs measured in virtual reproductions of that room, which used microphone array recordings and directional audio coding (Pulkki, 2007).

