Multichannel Overlapping Speaker Segmentation Using Multiple Hypothesis Tracking Of Acoustic And Spatial Features

Aidan O T Hogg, Christine Evers, Patrick A Naylor

doi:10.1109/icassp39728.2021.9414130

Abstract

Speaker segmentation is an essential part of any diarization system and is important for many applications, including speaker indexing and automatic speech recognition (ASR) in multi-speaker environments. Segmentation of overlapping speech has recently been a key focus of this work. In this paper, we explore a new multimodal approach to overlapping speaker segmentation that simultaneously tracks both the speaker’s fundamental frequency (F0) and the speaker’s direction of arrival (DOA). Our proposed multiple hypothesis tracking system, which tracks both features jointly, improves segmentation performance compared with tracking these features separately. An illustrative example of overlapping speech demonstrates the effectiveness of the proposed system. We also undertake a statistical analysis on 12 meetings from the AMI corpus and show an average improvement of 14.1% in the HIT rate over a commonly used deep learning approach based on a bidirectional long short-term memory (BLSTM) network.
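
As a rough illustration of why tracking F0 and DOA together can help, the sketch below assigns one observation to one of two speaker tracks using a normalised distance. It is not the paper's multiple hypothesis tracker: it uses a simple nearest-track rule as a stand-in, and the function names, the example values, and the standard deviations `SIGMA_F0_HZ` and `SIGMA_DOA_DEG` are all assumptions made for this example.

```python
import math

# Hypothetical per-feature standard deviations used to normalise distances.
# These values are illustrative and do not come from the paper.
SIGMA_F0_HZ = 10.0
SIGMA_DOA_DEG = 5.0


def normalised_distance(track, observation, use_f0=True, use_doa=True):
    """Distance between a track and an observation, both (f0_hz, doa_deg)."""
    total = 0.0
    if use_f0:
        total += ((track[0] - observation[0]) / SIGMA_F0_HZ) ** 2
    if use_doa:
        # Wrap the angular difference into [-180, 180) degrees.
        diff = (track[1] - observation[1] + 180.0) % 360.0 - 180.0
        total += (diff / SIGMA_DOA_DEG) ** 2
    return math.sqrt(total)


def nearest_track(tracks, observation, **feature_flags):
    """Return the index of the track closest to the observation."""
    distances = [
        normalised_distance(t, observation, **feature_flags) for t in tracks
    ]
    return min(range(len(tracks)), key=distances.__getitem__)


# Two hypothetical speakers with nearly identical pitch but different directions.
tracks = [(120.0, 30.0), (122.0, 95.0)]
observation = (121.5, 33.0)  # assumed to come from the first speaker

f0_only = nearest_track(tracks, observation, use_doa=False)
joint = nearest_track(tracks, observation)
```

In this toy case the F0-only rule picks the second track because its pitch is marginally closer, while the joint rule picks the first track because the direction clearly separates the two speakers. A full multiple hypothesis tracker would instead keep several competing association hypotheses over time rather than committing to the single nearest track per frame.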

Similar Papers

Multiple Hypothesis Tracking for Overlapping Speaker Segmentation
Aidan O T Hogg ... Christine Evers
01 Oct 2019

Multiple Maneuvering Target Tracking Using MHT and Nonlinear Non-Gaussian Kalman Filter
P Muthumanikandan ... V Vaidehi
01 Jan 2008

Overlapping Speaker Segmentation Using Multiple Hypothesis Tracking of Fundamental Frequency
Aidan O T Hogg ... Christine Evers
IEEE/ACM Transactions on Audio, Speech, and Language Processing | VOL. 29
01 Jan 2020

Continuous time representation of multiple hypothesis track data
Samuel S Blackman ... Oliver E Drummond
22 Oct 1993