Abstract

An analysis of the relationship between the bandwidth of acoustic signals and the required resolution of steered-response power phase transform (SRP-PHAT) maps used for sound source localization is presented. This relationship does not rely on the far-field assumption, nor does it depend on any specific array topology. The proposed analysis treats the computation of an SRP map as a process of sampling a set of generalized cross-correlation (GCC) functions, each one corresponding to a different microphone pair. From this approach, we derive a rule that relates GCC bandwidth to inter-microphone distance, resolution of the SRP map, and the potential position of the sound source relative to the array position. This rule is a sufficient condition for an aliasing-free calculation of the specified SRP-PHAT map. Simulation results show that limiting the bandwidth of the GCC according to this rule leads to significant reductions in sound source localization errors when sources are not in the immediate vicinity of the microphone array. These error reductions are more relevant for coarser resolutions of the SRP map, and they occur in both anechoic and reverberant environments.

Highlights

  • Sound source localization based on steered-response power (SRP) maps computed using the generalized cross-correlation (GCC) function with phase transform (PHAT), i.e. SRP-PHAT, has been reported to perform robustly against noise and, especially, reverberation [1], [2]

  • The proposed analysis considers the computation of an SRP map as a process of sampling a set of generalized cross-correlation (GCC) functions, each one corresponding to a different microphone pair

  • To avoid localization errors, several approaches have been proposed, such as stochastic region contraction (SRC), which involves performing a stochastic search of the highest peaks in the SRP map [12] before decreasing map resolution and reducing map extent; calculating the integral of the GCC-PHAT along an interval of time delay values defined by the position of each grid point and the spatial resolution of the map [13]; or designing the map grid considering the specific geometry of the microphone array [14]

Summary

INTRODUCTION

Sound source localization based on steered-response power (SRP) maps computed using the generalized cross-correlation (GCC) function with phase transform (PHAT), i.e. SRP-PHAT, has been reported to perform robustly against noise and, especially, reverberation [1], [2]. To avoid localization errors, several approaches have been proposed, such as stochastic region contraction (SRC), which involves performing a stochastic search of the highest peaks in the SRP map [12] before decreasing map resolution and reducing map extent; calculating the integral of the GCC-PHAT along an interval of time delay values defined by the position of each grid point and the spatial resolution of the map [13]; or designing the map grid considering the specific geometry of the microphone array [14]. The derived rule can be applied to hierarchical searches at every resolution level to avoid the emergence of spurious maxima in the corresponding SRP maps, achieving lower errors in sound source localization estimates. It also provides an alternative interpretation, based on basic signal processing theory, of algorithms involving GCC integration [13], design of map grids with reduced resolution in certain areas [14], or adjustment of grid resolution as a function of signal bandwidth [4].
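The view of an SRP map as a set of sampled GCC functions can be illustrated with a minimal Python sketch. It is not the paper's implementation and it does not reproduce the bandwidth rule itself: `f_max` is simply a free cutoff that such a rule could supply. The function names (`phat_spectrum`, `srp_phat_map`, `simulate`), the speed of sound `C`, and the anechoic white-noise test signal are all assumptions made for illustration.

```python
import itertools
import numpy as np

C = 343.0  # assumed speed of sound in m/s


def phat_spectrum(x1, x2, fs, f_max=None):
    """PHAT-weighted cross-spectrum of one microphone pair.

    If f_max (Hz) is given, bins above it are zeroed, which
    band-limits the corresponding GCC function.
    """
    n = len(x1)
    cross = np.fft.rfft(x1, n) * np.conj(np.fft.rfft(x2, n))
    phat = cross / np.maximum(np.abs(cross), 1e-12)  # keep phase only
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    if f_max is not None:
        phat = np.where(freqs <= f_max, phat, 0.0)
    return phat, freqs


def srp_phat_map(signals, mics, grid, fs, f_max=None):
    """Sum, over all pairs, the GCC sampled at the delay each grid point implies."""
    srp = np.zeros(len(grid))
    # Distance from every grid point to every microphone: shape (n_grid, n_mics).
    dists = np.linalg.norm(grid[:, None, :] - mics[None, :, :], axis=2)
    for i, j in itertools.combinations(range(len(mics)), 2):
        phat, freqs = phat_spectrum(signals[i], signals[j], fs, f_max)
        # Exact time difference of arrival per grid point (no far-field assumption).
        tau = (dists[:, i] - dists[:, j]) / C
        # Evaluate the inverse Fourier sum directly at each tau.
        steering = np.exp(2j * np.pi * np.outer(tau, freqs))
        srp += np.real(steering @ phat)
    return srp


def simulate(source, mics, fs, n=2048, seed=0):
    """Hypothetical anechoic signals: white noise with exact circular delays."""
    rng = np.random.default_rng(seed)
    spec = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    delays = np.linalg.norm(mics - source, axis=1) / C
    return np.array(
        [np.fft.irfft(spec * np.exp(-2j * np.pi * freqs * d), n) for d in delays]
    )
```

Each grid point maps to one delay per microphone pair, so the spacing of the grid determines how densely every GCC is sampled. Lowering `f_max` widens the GCC peaks, which is the mechanism a coarse grid relies on to avoid missing them.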

PROBLEM STATEMENT
IMPLICATIONS FOR THE CALCULATION OF THE SRP MAP
CALCULATION OF SRP MAPS WITH VARIABLE GCC BANDWIDTH
EXPERIMENTS AND RESULTS
RESULTS
DISCUSSION AND CONCLUSIONS