Abstract

Immersion is critical to the recent success of augmented/virtual reality applications, and it requires high-quality spatial audio. However, because the many anthropometric parameters of listeners vary strongly between individuals, deriving the proper acoustic perturbation characteristics when producing immersive spatial audio via loudspeakers, where both speed and precision matter, has long been a research challenge. This study first adopts gain vectors for loudspeakers (GVL) to represent the acoustic perturbations, which are sensitive to both the frequency band and the anthropometric parameters of an individual. A radial basis function neural network then maps the parameter sets to the corresponding GVLs. A parallel convolution algorithm convolves the GVLs with the source signals, and the outputs drive the loudspeakers to produce the designated, highly individualized spatial audio. Experimental results indicate that (1) the binaural cue deviation decreases by 12.21% on average and the subjective score of the listeners increases by 27.24%, and (2) the ratio of the time consumed by parallel convolution on six threads to that of a general convolution is 0.373, demonstrating that highly individualized immersive spatial audio can be produced in real time.
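
The parallel convolution step can be illustrated with a minimal sketch. The abstract does not state how the GVLs are laid out or how the work is split across threads, so the code below assumes that each GVL is a vector of filter taps for one loudspeaker and that each loudspeaker is handled as a separate task. The names `convolve_channel` and `render_parallel`, the signal length, and the tap count are all hypothetical and chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def convolve_channel(source, gvl):
    """Convolve the source signal with one loudspeaker's gain vector."""
    return np.convolve(source, gvl, mode="full")


def render_parallel(source, gvls, workers=6):
    """Convolve the source with every gain vector, one task per loudspeaker."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns the results in the same order as the input rows.
        feeds = list(pool.map(lambda g: convolve_channel(source, g), gvls))
    return np.stack(feeds)


# Hypothetical data: random samples stand in for audio and for learned GVLs.
rng = np.random.default_rng(0)
source = rng.standard_normal(48_000)   # assumed: one second at 48 kHz
gvls = rng.standard_normal((6, 128))   # assumed: six loudspeakers, 128 taps each

feeds = render_parallel(source, gvls)
serial = np.stack([np.convolve(source, g) for g in gvls])
```

Here `feeds` holds one driving signal per loudspeaker and matches the sequential result in `serial`. The sketch only shows the structure of the computation; it does not reproduce the reported timing ratio, which depends on the authors' implementation and hardware.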
