Abstract

The coherent signal subspace method (CSSM) enables the direction-of-arrival (DoA) estimation of coherent sources with subspace localization methods. The focusing process that aligns the signal subspaces within a frequency band to its central frequency is central to the CSSM. Within current focusing approaches, a direction-independent focusing approach may be more suitable for reverberant environments since no initial estimation of the sources' DoAs is required. However, these methods use integrals over the steering function, and cannot be directly applied to arrays around complex scattering structures, such as robot heads. In this article, current direction-independent focusing methods are extended to arrays for which the steering function is available only for selected directions, typically in a numerical form. Spherical harmonics decomposition of the steering function is then employed to formulate several aspects of the focusing error. A case of two coherent sources is studied and guidelines for the selection of the frequency smoothing bandwidth are suggested. The performance of the proposed methods is then investigated for an array that is mounted on a robot head. The focusing process is integrated within the direct-path dominance (DPD) test method for speaker localization, originally designed for spherical arrays, extending its application to arrays with arbitrary configurations. Finally, experiments with real data verify the feasibility of the proposed method to successfully estimate the DoAs of multiple speakers under real-world conditions.

Highlights

  • Direction-of-arrival (DoA) estimation is an important and timely challenge in audio signal processing with applications in acoustic scene analysis, signal enhancement, and speech processing [1], [2]

  • One approach is tailored to spherical arrays, and employs plane-wave decomposition (PWD), which can be viewed as the application of direction-independent focusing and completely removes the frequency dependence of the steering matrices [21]

  • The array is composed of 12 omnidirectional microphones arranged in a pseudo-spherical arrangement. This array was employed in the recent acoustic sources LOCalization And TrAcking (LOCATA) challenge and a detailed description of the array can be found in the challenge documents [23]


Summary

INTRODUCTION

Direction-of-arrival (DoA) estimation is an important and timely challenge in audio signal processing with applications in acoustic scene analysis, signal enhancement, and speech processing [1], [2]. In contrast to the common focusing approach, the focusing methods proposed in [5]–[7] are direction-independent and require neither initial DoA estimates nor an iterative process. This is achieved by formulating focusing matrices that minimize the mean square focusing error over all directions. Real recordings with an array mounted on a Nao robot head, made as part of the LOCalization And TrAcking (LOCATA) challenge [15], [16], were employed to evaluate and compare the performance of the proposed approach and the DPD test proposed in [14]. The method of [14] was chosen for comparison because it is based on the DPD approach and can be applied to arbitrary arrays.
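
The idea of a focusing matrix that minimizes the mean square focusing error over all directions can be illustrated with a minimal sketch. It is not the formulation of the article, which builds on a spherical harmonics decomposition of the steering function. It assumes instead a plain regularized least-squares fit over a finite set of sampled directions, and a free-field steering matrix stands in for the numerically obtained steering function of a real array. The function names, the 12-microphone geometry, the direction grid, the frequencies, and the regularization constant are all hypothetical.

```python
import numpy as np

def steering_matrix(mic_xyz, directions, freq_hz, c=343.0):
    """Free-field plane-wave steering matrix, shape (n_mics, n_dirs).
    Assumption: a real array on a scatterer would use measured or simulated data instead."""
    k = 2.0 * np.pi * freq_hz / c
    # Phase of a unit-amplitude plane wave arriving from each unit direction vector
    return np.exp(1j * k * (mic_xyz @ directions.T))

def focusing_matrix(A_f, A_f0, reg=1e-6):
    """Regularized least-squares T minimizing ||T A_f - A_f0||_F over the sampled directions."""
    n_mics = A_f.shape[0]
    gram = A_f @ A_f.conj().T
    # Tikhonov term scaled by the mean diagonal of the Gram matrix (illustrative choice)
    lam = reg * np.trace(gram).real / n_mics
    # T = A_f0 A_f^H (gram + lam I)^-1, computed through a linear solve
    T_h = np.linalg.solve(gram + lam * np.eye(n_mics), A_f @ A_f0.conj().T)
    return T_h.conj().T

def fibonacci_sphere(n):
    """Nearly uniform unit vectors on the sphere, shape (n, 3)."""
    i = np.arange(n) + 0.5
    z = 1.0 - 2.0 * i / n
    phi = np.pi * (1.0 + 5.0 ** 0.5) * i
    r = np.sqrt(1.0 - z ** 2)
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)

# Hypothetical array: 12 microphones on a sphere of radius 0.1 m
rng = np.random.default_rng(0)
mics = rng.normal(size=(12, 3))
mics = 0.1 * mics / np.linalg.norm(mics, axis=1, keepdims=True)

dirs = fibonacci_sphere(200)   # sampled directions
f0, f = 1000.0, 1100.0         # focusing (central) frequency and one in-band frequency

A0 = steering_matrix(mics, dirs, f0)
A = steering_matrix(mics, dirs, f)
T = focusing_matrix(A, A0)

# Normalized focusing error with and without the focusing transformation
err = np.linalg.norm(T @ A - A0) ** 2 / np.linalg.norm(A0) ** 2
baseline = np.linalg.norm(A - A0) ** 2 / np.linalg.norm(A0) ** 2
print(f"focusing error: {err:.3e}, without focusing: {baseline:.3e}")
```

Because the fit uses only steering vectors at sampled directions, the same sketch applies when the steering function is available only numerically. The choice of direction grid and regularization affects the error and would need to be examined for a specific array.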

ARRAY MODEL AND FREQUENCY SMOOTHING
CURRENT APPROACHES TO FOCUSING
PROPOSED FOCUSING METHOD
FACTORS AFFECTING FOCUSING ERROR
FOCUSING ANALYSIS FOR A ROBOT HEAD
Spherical Harmonics Order Truncation
Matrix Inversion
Spatial Aliasing
Focusing Performance
SMOOTHING BANDWIDTH SELECTION
APPLICATION TO SPEAKER LOCALIZATION
EXPERIMENTAL VERIFICATION
Findings
CONCLUSION
