Improvement of sector based multiple speaker localization in a smart room

M Hesam, H Marvi

doi:10.1109/icosp.2010.5656145

Abstract

Recent advances in computer technology and speech processing, together with growing interest in human-machine communication, have made it possible to develop hands-free speech applications that use microphone arrays in smart room environments. One of the most important tasks in a smart room is multi-speaker localization, which enables a wide spectrum of applications. Source localization commonly combines the hyperbolae produced by time delay estimation (TDE) between several microphone pairs. In this paper, we propose a new acoustic multi-speaker localization function, which we call OPROD-PHAT, that combines the TDEs by multiplying the spatial likelihood functions (SLFs) generated from each microphone pair and incorporates head orientation information. To reduce the search space, we divide the meeting room into a few sectors. For each time frame, we estimate the average output power of the OPROD-PHAT function within the volume of each sector, and a new two-step adaptive threshold determines more reliably which sectors contain an active speaker. Finally, we apply a closed-form time difference of arrival (TDOA) based localization approach within each active sector, which shows that a single-speaker TDOA method can be applied to a multi-speaker problem. Simulation results show the superior performance of the proposed system.
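To make the idea of multiplying pairwise SLFs and averaging over sectors concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that each pairwise SLF is a GCC-PHAT cross-correlation evaluated at the TDOA a candidate point would produce. The head orientation weighting and the two-step adaptive threshold are not specified in the abstract, so they are left out. The function names `gcc_phat`, `product_slf`, and `sector_power` and all parameters are hypothetical and chosen for illustration only.

```python
import itertools
import numpy as np

def gcc_phat(x1, x2, max_lag):
    """PHAT-weighted cross-correlation of x1 and x2.

    Returns an array of length 2*max_lag + 1; index k holds lag k - max_lag.
    The peak lag equals (delay of x1) - (delay of x2), in samples.
    """
    n = len(x1) + len(x2)                      # zero-pad to avoid wrap-around
    cross = np.fft.rfft(x1, n=n) * np.conj(np.fft.rfft(x2, n=n))
    cross /= np.abs(cross) + 1e-12             # PHAT: keep phase, drop magnitude
    cc = np.fft.irfft(cross, n=n)
    return np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))

def product_slf(signals, mics, points, fs, c=343.0, max_lag=64):
    """Multiply pairwise GCC-PHAT values at the TDOA implied by each point.

    signals: list of M equal-length 1-D arrays, one per microphone
    mics:    (M, D) microphone positions; points: (P, D) candidate positions
    Assumption: negative correlation values are clipped to zero before the
    product, so a single mismatching pair suppresses the candidate.
    """
    dists = np.linalg.norm(points[:, None, :] - mics[None, :, :], axis=2)
    slf = np.ones(len(points))
    for i, j in itertools.combinations(range(len(mics)), 2):
        cc = gcc_phat(signals[i], signals[j], max_lag)
        lags = np.rint((dists[:, i] - dists[:, j]) * fs / c).astype(int)
        lags = np.clip(lags, -max_lag, max_lag)
        slf *= np.maximum(cc[lags + max_lag], 0.0)
    return slf

def sector_power(slf, sector_ids, n_sectors):
    """Average SLF value over the candidate points assigned to each sector."""
    sums = np.bincount(sector_ids, weights=slf, minlength=n_sectors)
    counts = np.bincount(sector_ids, minlength=n_sectors)
    return sums / np.maximum(counts, 1)
```

In this sketch, a sector would be declared active when its entry in the output of `sector_power` exceeds some threshold. A fixed or frame-relative threshold would only be a stand-in for the two-step adaptive threshold described above.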

Full Text