Abstract

This paper presents a novel filtering approach for tracking multiple concurrent speakers with a microphone array. The framework proposes a bank of Kalman filters that evolves in time according to a temporal Hidden Markov Model (HMM). The approach is designed to overcome two major problems that occur in spontaneous speech: 1) speaker overlap, which is handled by a bank of parallel Kalman filters that track multiple simultaneous speakers, and 2) the high discontinuity of spontaneous speech caused by short breaks and silences, which is handled by an HMM that allows speakers to change their state (speaking, silent, etc.) over time. The actual number and locations of the active speakers are extracted from the active filters using a second Kalman filter. Experiments on the AV16.3 corpus showed an average tracking rate improvement of 8% compared to a short-term clustering approach, while being 7 times faster.
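To make the idea of a Kalman filter bank gated by an HMM more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a one-dimensional state (an azimuth angle in degrees, with wrap-around ignored), a random-walk motion model, a two-state HMM (speaking or silent) per filter, and nearest-neighbour assignment of detections to filters. All names (`SpeakerFilter`, `step`) and all numeric values are hypothetical. The paper's second Kalman filter, which extracts the number and locations of active speakers, is replaced here by a simple probability threshold.

```python
from dataclasses import dataclass

# Illustrative HMM parameters (assumptions, not values from the paper).
P_STAY_ACTIVE = 0.9    # P(speaking at t | speaking at t-1)
P_BECOME_ACTIVE = 0.1  # P(speaking at t | silent at t-1)
P_DETECT_ACTIVE = 0.8  # P(a detection is assigned | speaking)
P_DETECT_SILENT = 0.1  # P(a detection is assigned | silent), a false alarm


@dataclass
class SpeakerFilter:
    """One scalar Kalman filter plus a speaking/silent HMM state."""
    angle: float           # estimated azimuth in degrees
    var: float             # variance of the angle estimate
    p_active: float = 0.5  # posterior probability that the speaker is speaking

    def predict(self, process_var: float = 1.0) -> None:
        # Random-walk motion model: the mean is kept, the uncertainty grows.
        self.var += process_var
        # HMM prediction step on the speaking/silent state.
        self.p_active = (P_STAY_ACTIVE * self.p_active
                         + P_BECOME_ACTIVE * (1.0 - self.p_active))

    def update(self, z, meas_var: float = 4.0) -> None:
        # z is the assigned angle measurement, or None if nothing was assigned.
        if z is None:
            like_active = 1.0 - P_DETECT_ACTIVE
            like_silent = 1.0 - P_DETECT_SILENT
        else:
            like_active = P_DETECT_ACTIVE
            like_silent = P_DETECT_SILENT
            # Standard scalar Kalman measurement update.
            gain = self.var / (self.var + meas_var)
            self.angle += gain * (z - self.angle)
            self.var *= (1.0 - gain)
        # Bayes update of the HMM state given detection / no detection.
        num = like_active * self.p_active
        self.p_active = num / (num + like_silent * (1.0 - self.p_active))


def step(bank, detections, gate: float = 15.0):
    """Advance every filter by one frame and return the filters judged active."""
    for f in bank:
        f.predict()
    unused = list(detections)
    for f in bank:
        # Greedy nearest-neighbour association inside a gate (an assumption).
        best = min(unused, key=lambda z: abs(z - f.angle), default=None)
        if best is not None and abs(best - f.angle) > gate:
            best = None
        if best is not None:
            unused.remove(best)
        f.update(best)
    # Stand-in for the paper's second Kalman filter: a plain threshold.
    return [f for f in bank if f.p_active > 0.5]
```

Calling `step(bank, detections)` once per frame with a list of `SpeakerFilter` objects and the frame's angle detections keeps overlapping speakers in separate filters, while a speaker who falls silent is retained in the bank with a decaying `p_active` rather than being deleted, so tracking can resume after a short pause.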
