Abstract

The growing importance of spatial audio technologies has highlighted the need to adapt correctly to the individual characteristics of the human auditory system, and with it the need for humanoid localization systems for testing these technologies. To this end, this paper introduces a novel feature analysis and selection approach for binaural localization and builds a probabilistic localization mapping model that is especially useful for localization in the vertical dimension. The approach uses mutual information as a metric to identify the most significant frequencies of the interaural phase difference and the interaural level difference. Then, by using the random forest algorithm and embedding the mutual information as a feature selection criterion, the feature selection procedure is integrated into the training of the localization mapping. The trained mapping model uses interaural features more efficiently and, because of its multiple-tree structure, is robust to noise and interference. By integrating direct-path relative transfer function estimation, we devise a novel localization approach with improved performance in the presence of noise and reverberation. The proposed mapping model is compared with the state-of-the-art manifold learning procedure in different acoustical configurations and yields more accurate and robust output.
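As an illustration of the mutual information criterion described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes each observation is a row of per-frequency interaural feature values paired with a discrete direction label, and it uses a simple plug-in estimator with equal-width quantization. The function names (`mutual_information`, `quantize`, `rank_features_by_mi`) and the `n_bins` parameter are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits for two discrete sequences."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    px = Counter(xs)             # marginal counts of X
    py = Counter(ys)             # marginal counts of Y
    pxy = Counter(zip(xs, ys))   # joint counts of (X, Y)
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def quantize(values, n_bins):
    """Map real values to integer bin indices using equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def rank_features_by_mi(features, labels, n_bins=8):
    """Rank feature columns (e.g. frequency bins) by MI with the labels.

    features: list of rows, one row per observation.
    labels:   one discrete direction label per observation.
    Returns a list of (column_index, mi_bits), most informative first.
    """
    n_features = len(features[0])
    scores = []
    for k in range(n_features):
        column = [row[k] for row in features]
        scores.append((k, mutual_information(quantize(column, n_bins), labels)))
    return sorted(scores, key=lambda s: s[1], reverse=True)
```

Under these assumptions, the highest-ranked columns would be kept as the selected feature vector. The plug-in estimator is biased for small sample sizes, so the ranking is only indicative.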

Highlights

  • Spatial audio technologies have shown their importance in various fields, such as video conferencing, virtual reality, humanoid robot interaction and hearing aids, and many existing methods reproduce a spatial sound field on different devices for human listeners

  • By integrating direct-path relative transfer function estimation, we devise a novel localization approach with improved performance in the presence of noise and reverberation

  • Much behavioural and psychoacoustic evidence has confirmed that two individualized interaural cues, Interaural Time Difference (ITD) and Interaural Level Difference (ILD), play an essential role in localizing a sound source

Summary

Introduction

Spatial audio technologies have shown their importance in various fields, such as video conferencing, virtual reality, humanoid robot interaction and hearing aids, and many existing methods reproduce a spatial sound field on different devices for human listeners. Weng et al. adopted a non-parametric tree-based learning method to retrieve the mapping between the interaural cues and source locations with fewer restrictions on its spatiotemporal characteristics and environment structure [19]. Deleforge et al. showed the elevation ambiguities of the Interaural Phase Difference (IPD) in the high-frequency range (between 2 and 8 kHz) and used a feature space formed by concatenating full-spectrum ILD and low-frequency IPD [17]. Another challenge in sound source localization is localizing the source in the presence of noise and reverberation. Model proposed in [17] in different noise and reverberation environments without prior knowledge of the acoustical conditions
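The concatenated feature space mentioned above (full-spectrum ILD plus low-frequency IPD) can be sketched as follows. This is a hedged illustration rather than the method of [17]: it assumes one complex spectrum per ear for a single frame, a left-relative-to-right sign convention, and a 2 kHz IPD cutoff taken from the lower edge of the ambiguous range quoted above. The names `ild_db`, `ipd_rad`, `interaural_features` and `ipd_cutoff_hz` are hypothetical.

```python
import cmath
import math


def ild_db(left, right, eps=1e-12):
    """ILD per frequency bin in dB: 20 * log10(|L| / |R|).

    eps guards against division by zero and log of zero (assumed value).
    """
    return [
        20.0 * math.log10((abs(l) + eps) / (abs(r) + eps))
        for l, r in zip(left, right)
    ]


def ipd_rad(left, right):
    """IPD per frequency bin in radians: phase of L * conj(R), in [-pi, pi]."""
    return [
        cmath.phase(complex(l) * complex(r).conjugate())
        for l, r in zip(left, right)
    ]


def interaural_features(left, right, freqs_hz, ipd_cutoff_hz=2000.0):
    """Concatenate full-spectrum ILD with IPD below ipd_cutoff_hz.

    left, right: complex spectra of one frame, one value per frequency bin.
    freqs_hz:    centre frequency of each bin in Hz.
    """
    if not (len(left) == len(right) == len(freqs_hz)):
        raise ValueError("left, right and freqs_hz must have equal length")
    ild = ild_db(left, right)
    ipd_low = [
        p for p, f in zip(ipd_rad(left, right), freqs_hz) if f < ipd_cutoff_hz
    ]
    return ild + ipd_low
```

For a spectrum with N bins of which M lie below the cutoff, the resulting vector has N + M entries. In practice the spectra would come from a short-time Fourier transform of the two microphone signals, which is outside the scope of this sketch.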

Individualized Feature Selection Using Mutual Information
Mutual Information Computation
Analysis of Mutual Information in Interaural Cues
Spatial Feature Learning and Selected Feature Vector
Result
Probabilistic Localization Model and System Design
Feature Dependency Analysis and Assembled Data Partition Model
Data Partition and Tree-Structured Model
Random Forest Bagging and Unbiased Probability Estimation
Model Training and Parameter Selection
Trained Model Interpretation
Experiments With Simulated Data
Simulation Configuration
Performance Impact of the Feature Vector Length
Localization Performance
Performance Measurements and Simulation Configuration
Localization Performance with Different Training Environment
Localization Performance with Additive Noise
Localization Performance with Reverberations
Experiment in Laboratory Environment
Experiment Facility and Room Configurations
Testing Positions and Microphone Data Pre-Processing
Experiment Result
Findings
Conclusions