Abstract
In this work, we propose an algorithm for acoustic source localization based on time delay of arrival (TDOA) estimation. In earlier work by other authors, an initial closed-form approximation was first used to estimate the true position of the speaker followed by a Kalman filtering stage to smooth the time series of estimates. In the proposed algorithm, this closed-form approximation is eliminated by employing a Kalman filter to directly update the speaker's position estimate based on the observed TDOAs. In particular, the TDOAs comprise the observation associated with an extended Kalman filter whose state corresponds to the speaker's position. We tested our algorithm on a data set consisting of seminars held by actual speakers. Our experiments revealed that the proposed algorithm provides source localization accuracy superior to the standard spherical and linear intersection techniques. Moreover, the proposed algorithm, although relying on an iterative optimization scheme, proved efficient enough for real-time operation.
Highlights
Most practical acoustic source localization schemes are based on time delay of arrival (TDOA) estimation, in part because such systems are conceptually simple
(2) For a given source location, the squared error is calculated between the estimated TDOAs and those determined from the source location
If the TDOA estimates are assumed to have a Gaussian-distributed error term, it can be shown that the least-squares metric used in Step (2) provides the maximum likelihood (ML) estimate of the speaker location [2]
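The squared-error criterion described in Step (2) can be illustrated with a short sketch. This is not the authors' code: the function names, the speed-of-sound constant, and the per-pair weighting by an assumed error variance are choices made here for illustration. The TDOA predicted for a candidate source location is the difference in propagation time to the two microphones of a pair.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for room-temperature air


def predicted_tdoa(source, mic_a, mic_b):
    """TDOA (seconds) implied by a candidate source position for one mic pair."""
    return (math.dist(source, mic_a) - math.dist(source, mic_b)) / SPEED_OF_SOUND


def squared_error(source, observed, pairs, variances):
    """Variance-weighted squared error between observed and predicted TDOAs.

    observed  -- estimated TDOAs, one per microphone pair
    pairs     -- list of (mic_a, mic_b) coordinate tuples
    variances -- assumed error variance of each TDOA estimate (hypothetical input)
    """
    return sum(
        (tau - predicted_tdoa(source, mic_a, mic_b)) ** 2 / var
        for tau, (mic_a, mic_b), var in zip(observed, pairs, variances)
    )
```

Minimizing `squared_error` over candidate positions gives the least-squares location estimate; under the Gaussian error assumption, that minimizer is also the ML estimate.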
Summary
Most practical acoustic source localization schemes are based on time delay of arrival (TDOA) estimation, in part because such systems are conceptually simple. Brandstein et al. [4] proposed yet another closed-form approximation known as linear intersection. Their algorithm proceeds by first calculating a bearing line to the source for each pair of sensors. As shown here, the nonlinearity seen in the acoustic source localization problem is relatively mild and can be adequately handled by performing several local iterations for each time step, as explained in [14]. Such theoretical considerations notwithstanding, the question of whether Kalman or particle filters are better suited for speaker tracking will only be answered by empirical studies. We present a numerically stable implementation of the Kalman filtering algorithms discussed in this work that is based on the Cholesky decomposition.
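The idea of treating the TDOAs as the observation of an extended Kalman filter whose state is the speaker's position, and of handling the mild nonlinearity with several local iterations per time step, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the speed of sound, a common error variance `var` for all TDOAs, uncorrelated observation errors (which permits processing the TDOAs one at a time), and the default iteration count are all hypothetical. The sketch also updates the covariance matrix directly rather than propagating a Cholesky factor.

```python
import math

C = 343.0  # m/s; assumed speed of sound


def tdoa(x, pair):
    """Observation function: TDOA (seconds) for position x and one mic pair."""
    m1, m2 = pair
    return (math.dist(x, m1) - math.dist(x, m2)) / C


def tdoa_gradient(x, pair):
    """Gradient of the TDOA with respect to position (one row of the Jacobian).

    Undefined if x coincides with a microphone position.
    """
    m1, m2 = pair
    d1, d2 = math.dist(x, m1), math.dist(x, m2)
    return [((x[k] - m1[k]) / d1 - (x[k] - m2[k]) / d2) / C for k in range(len(x))]


def iekf_update(x_pred, P_pred, taus, pairs, var, n_iter=5):
    """Iterated EKF measurement update with scalar (sequential) processing.

    x_pred, P_pred -- predicted position and its covariance (list of lists)
    taus, pairs    -- observed TDOAs and the matching microphone pairs
    var            -- assumed variance of each TDOA error (seconds squared)
    """
    n = len(x_pred)
    eta = list(x_pred)  # current linearization point
    x, P = list(x_pred), [row[:] for row in P_pred]
    for _ in range(n_iter):
        # Each local iteration restarts from the prediction and relinearizes at eta.
        x, P = list(x_pred), [row[:] for row in P_pred]
        for tau, pair in zip(taus, pairs):
            h = tdoa_gradient(eta, pair)
            Ph = [sum(P[r][k] * h[k] for k in range(n)) for r in range(n)]
            s = sum(h[k] * Ph[k] for k in range(n)) + var  # innovation variance
            # Innovation of the model linearized about eta.
            innov = tau - tdoa(eta, pair) - sum(h[k] * (x[k] - eta[k]) for k in range(n))
            x = [x[r] + Ph[r] * innov / s for r in range(n)]
            P = [[P[r][c] - Ph[r] * Ph[c] / s for c in range(n)] for r in range(n)]
        eta = x
    return x, P
```

With `n_iter=1` this reduces to an ordinary EKF update linearized about the predicted position; larger values relinearize about the latest estimate. A full tracker would also need a prediction step with a process model, which is omitted here.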