Abstract

Acoustic source localization (ASL) is a fundamental yet still challenging signal processing problem in sound acquisition, speech communication, and human–machine interfaces. Many ASL algorithms have been developed, such as the steered response power (SRP), the SRP-phase transform, the minimum variance distortionless response, the multiple signal classification (MUSIC), the householder transform-based methods, to name but a few. Most of those algorithms require hundreds or even thousands of snapshots to produce one reliable estimate, which make them difficult to track moving sources. Moreover, not much efforts have been reported in the literature to show the intrinsic relationships among those methods. This paper deals with the ASL problem with its focal point placed on how to achieve ASL with a short frame of acoustic signal (corresponding to a single snapshot in the frequency domain). It reformulates the ASL problem from the perspective of geometric projection. Four types of power functions are proposed, leading to several different algorithms for ASL. By analyzing those power functions, we show the equivalence between the popularly used conventional algorithms and our methods, which provides some new insights into the conventional algorithms. The relationships among different algorithms are discussed, which make it easy to comprehend the pros and cons of each of those methods. Experiments in real acoustic environments corroborate the theoretical analysis, which in turn justifies the contribution of this paper.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call