Abstract

Speaker localization is a technique to locate and track an active speaker from multiple acoustic sources using microphone array. Microphone array is used to improve the speech quality of recorded speech signal in meeting room and other places. In this work, the time delay estimation between source and each microphone is calculated using a localization method called time differences of arrival (TDOA). TDOA localization consists of two steps namely (a) a time delay estimator and (b) a localization estimator. For time delay estimation, the generalized cross-correlation using phase transform, the generalized cross correlation using maximum likelihood, linear prediction (LP) residual and the Hilbert envelope of the LP residual are chosen for estimating the location of a person. A new speaker localization algorithm known as group search optimization (GSO) algorithm is proposed. The performance of this algorithm is analyzed and compared with Gauss–Newton nonlinear least square method and genetic algorithm. Experimental results show that the proposed GSO method outperforms the other methods in terms of mean square error, root mean square error, mean absolute error, mean absolute percentage error, euclidean distance and mean absolute relative error.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call