Abstract

This paper presents a speaker localization system intended for camera-based speaker tracking. The authors use the signals from two microphones, placed opposite each other, to determine the position of the active speaker in order to supervise the audio-visual recording. To perform the localization task, the authors propose and employ two methods: the filtered correlation method and the energy differential method. The first method computes the correlation between the two signals collected by the two microphones and applies a special filtering. The second computes the logarithmic energy differential between these two signals. When several methods are used simultaneously to reach a decision, it is often worthwhile to apply a fusion technique that combines their estimates or decisions in order to improve system performance. For that purpose, this paper proposes two fusion techniques, operating at the decision level, that fuse the two estimates into a single one that should be more precise.
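
As a rough illustration of the two cues described above, the following minimal sketch computes a cross-correlation lag and a logarithmic energy differential for one pair of microphone frames. It is not the authors' implementation: the function names `best_lag` and `log_energy_differential`, the frame representation (plain lists of samples), the decibel scaling, the `eps` guard, and the sign conventions are all assumptions made for illustration. The "special filtering" of the filtered correlation method and the two fusion techniques are not specified in the abstract, so they are omitted here.

```python
import math


def log_energy_differential(first, second, eps=1e-12):
    """Log-energy difference in dB between two frames (assumed convention).

    Positive values mean the first microphone received more energy.
    eps is a hypothetical guard against log(0) on silent frames.
    """
    e_first = sum(x * x for x in first)
    e_second = sum(x * x for x in second)
    return 10.0 * math.log10((e_first + eps) / (e_second + eps))


def best_lag(first, second, max_lag):
    """Lag in samples that maximizes the plain cross-correlation.

    Frames must have equal length. With the convention used here,
    corr(lag) = sum_i first[i] * second[i + lag], so a positive result
    means the second signal is a delayed copy of the first.
    No filtering is applied in this sketch.
    """
    n = len(first)

    def corr(lag):
        # Only sum over indices where both samples exist.
        lo, hi = max(0, -lag), min(n, n - lag)
        return sum(first[i] * second[i + lag] for i in range(lo, hi))

    return max(range(-max_lag, max_lag + 1), key=corr)
```

Under these assumed conventions, a positive lag and a positive energy differential both suggest that the active speaker is nearer the first microphone; how the paper turns each cue into a position estimate and fuses the two is not described in the abstract.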
