Abstract

Speaker localization with microphone arrays has received significant attention in the past decade as a means of automatically tracking individuals in a closed space for videoconferencing, directed speech capture, and surveillance systems. Traditional techniques estimate the relative time difference of arrival (TDOA) between channels using the cross-correlation function. As we show in the context of speaker localization, these estimates yield poor results because of the joint effect of reverberation and the directivity of sound sources. In this paper, we present a novel method that utilizes a priori acoustic information about the monitored region, which makes it possible to localize directional sound sources by taking the effect of reverberation into account. The proposed method shows a significant improvement in performance over traditional methods under "noise-free" conditions. Further work is required to extend its capabilities to noisy environments.

Highlights

  • The inverse problem of localizing a source by using signal measurements at an array of sensors is a classical problem in signal processing, with applications in sonar, radar, and acoustic engineering

  • A novel time difference of arrival (TDOA)-based sound source localization algorithm is presented that integrates a priori information about the acoustic environment to localize directional sound sources in reverberant environments

  • The algorithm utilizes the redundant information provided by multiple sensors to enhance the TDOA performance

Summary

INTRODUCTION

The inverse problem of localizing a source by using signal measurements at an array of sensors is a classical problem in signal processing, with applications in sonar, radar, and acoustic engineering. Many new ideas have been proposed to deal more effectively with noise and reverberation, either by taking advantage of the nature of a speech signal [14, 15] or by utilizing redundant information from multiple sensor pairs [11, 16, 17, 18]. Another interesting approach is to utilize the impulse response functions from the source to the microphones. One example is the high-resolution spectral estimation technique [2, 3], in which the transfer functions are estimated blindly by an adaptive algorithm intended to find the eigenvalues of the cross-correlation matrix. The more accurate this estimate is, the better the relative delay between the two microphone signals can be estimated. We also consider the effect of source directivity on source localization performance; our system can localize nonisotropic sound sources (e.g., human sources) more accurately, without being limited by their orientation.
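For orientation, the conventional cross-correlation TDOA estimate between two microphone signals can be sketched as follows. This is only a minimal illustration of the traditional baseline, not the method proposed in the paper. The function name `estimate_tdoa`, the sampling rate `fs`, and the synthetic test signal are assumptions introduced here, and the sketch ignores reverberation and source directivity, which are exactly the effects the paper argues degrade this kind of estimate.

```python
import numpy as np


def estimate_tdoa(x1, x2, fs):
    """Estimate the delay of x2 relative to x1, in seconds.

    Hypothetical helper for illustration: a positive result means
    that x2 lags x1. Assumes a single dominant propagation path.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    # Full cross-correlation: output index k corresponds to
    # lag k - (len(x1) - 1) samples.
    cc = np.correlate(x2, x1, mode="full")
    lag = int(np.argmax(cc)) - (len(x1) - 1)
    return lag / fs


# Synthetic example (assumed values): white noise delayed by 5 samples.
fs = 16000
rng = np.random.default_rng(0)
x1 = rng.standard_normal(1000)
delay = 5
x2 = np.concatenate([np.zeros(delay), x1[:-delay]])
tdoa = estimate_tdoa(x1, x2, fs)
```

In an ideal free-field, noise-free setting the correlation peak sits at the true delay. In a reverberant room, reflections add further peaks to the cross-correlation function, which is the failure mode that motivates using a priori acoustic information.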

THE ACOUSTIC MODEL
THE EFFECT OF THE ACOUSTIC ENVIRONMENT ON THE CROSS-CORRELATION FUNCTION
Effect of source directivity
AGGREGATE EFFECT OF THE ACOUSTIC ENVIRONMENT
SOLVING THE INVERSE PROBLEM
Finding the prestored configuration which fits observations best
EFFECT OF DISCRETIZATION
The test environment
Optimal level of considerable reverberation effect
Performance in noisy condition
Performance in different acoustic environment
Speed of convergence
Validity of the applied acoustic model
Findings
Computational requirement
CONCLUSION