Abstract

When performing binaural spatialisation, it is widely accepted that the choice of head-related transfer functions (HRTFs), and in particular the use of individually measured ones, can affect localisation accuracy, externalisation, and overall realism. Yet the impact of HRTF choice on speech-in-noise performance in cocktail party-like scenarios has not been investigated in depth. This paper introduces a study in which 22 participants were presented with a frontal speech target and two lateral maskers, spatialised using a set of non-individual HRTFs. The speech reception threshold (SRT) was measured for each HRTF. Furthermore, using the SRT predicted by an existing speech perception model, the measured values were compensated in an attempt to remove overall HRTF-specific benefits. Results show significant overall differences among the SRTs measured using different HRTFs, consistent with the results predicted by the model. Individual differences between participants in their SRT performance with different HRTFs were also found, but their significance was reduced after the compensation. These findings are relevant to several research areas related to spatial hearing and speech perception, and suggest that when testing speech-in-noise performance within binaurally rendered virtual environments, the choice of HRTF for each individual should be carefully considered.

Highlights

  • The mean speech reception threshold (SRT) was calculated across all participants and is shown in the right graph of Fig. 5, together with the 95% confidence interval (CI), for each head-related transfer function (HRTF) used in the experiment

  • Raw SRTs showed a significant effect of HRTF on SRT when the HRTFA was included [F(7, 168) = 16.7861, p < 0.001] and when it was removed from the dataset [F(6, 147) = 6.3972, p < 0.001]

  • Considering the aims and hypotheses of the present study (H1 and H2), an argument could be made for quantifying this HRTF-related benefit and using it to compensate the SRT results. This would minimise the HRTF-specific differences, focusing the analysis on the monaural spectral nature of the HRTFs, on the relationship between each HRTF and each subject, and, possibly, on the impact of cognitive processes when completing SRT tasks using different HRTFs
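
The two steps mentioned in the highlights (a per-HRTF mean SRT with a 95% CI, and a model-based compensation of the measured SRTs) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the SRT values are made up, the names `mean_and_ci` and `compensate` are hypothetical, the CI uses a normal approximation, and the compensation simply subtracts each HRTF's predicted offset from the mean prediction, which may differ from the procedure used in the study.

```python
from statistics import NormalDist, mean, stdev


def mean_and_ci(values, confidence=0.95):
    """Return (mean, half-width) of a normal-approximation confidence interval."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(values) / len(values) ** 0.5
    return mean(values), half_width


def compensate(measured, predicted):
    """Remove each HRTF's predicted benefit from the measured SRTs.

    Assumption: the benefit is the model-predicted SRT for that HRTF
    minus the mean predicted SRT across all HRTFs.
    """
    reference = mean(predicted.values())
    return {
        hrtf: [srt - (predicted[hrtf] - reference) for srt in srts]
        for hrtf, srts in measured.items()
    }


# Made-up SRTs in dB (one value per participant); not data from the study.
measured = {
    "HRTF_1": [-12.0, -10.0, -11.0, -13.0],
    "HRTF_2": [-8.0, -9.0, -7.0, -8.0],
}
# Made-up model predictions in dB, one per HRTF.
predicted = {"HRTF_1": -11.0, "HRTF_2": -9.0}

compensated = compensate(measured, predicted)
for hrtf in measured:
    m, hw = mean_and_ci(compensated[hrtf])
    print(f"{hrtf}: mean {m:.2f} dB, 95% CI half-width {hw:.2f} dB")
```

In this toy example the raw means differ by 3.5 dB and the compensated means by 1.5 dB, which is the kind of reduction in HRTF-specific differences that the compensation aims for.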


Summary

Introduction

Once an HRTF has been measured (or precisely estimated) for a given listener, immersive virtual reality (VR) audio systems can use it to process sounds so that, when presented over headphones, they are perceived as emanating from any position in the surrounding 3D space. This technique is referred to as binaural spatialisation (Hammershøi and Møller, 2005). Previous work demonstrates that some attentional processes use HRTF cues to help focus auditory attention on a specific direction. Related to this is the cocktail party effect (Cherry, 1953). At the beginning of the current century, Bronkhorst presented a review of the cocktail party problem (Bronkhorst, 2000), later revisited (Bronkhorst, 2015), and introduced models of binaural speech perception that allow estimation

