Abstract

Speaker Change Detection (SCD) is the task of determining the time boundaries between speech segments produced by different speakers. SCD systems can be applied to many tasks, such as speaker diarization, speaker tracking, and transcribing audio with multiple speakers. Recent advances in deep learning have led to approaches that detect speaker change points directly from audio at the frame level using neural network models. These approaches may be improved further by exploiting the speaker information in the training data and by using content information extracted in an unsupervised manner. This work proposes a novel framework for the SCD task that uses a multitask learning architecture to leverage speaker information during training, and adds content information extracted from an unsupervised speech decomposition model to help detect speaker change points. Experimental results show that the multitask learning architecture with speaker information improves SCD performance, and that adding content information from the unsupervised speech decomposition model improves it further. To the best of our knowledge, this work outperforms the state-of-the-art SCD results [1] on the AMI dataset.
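The multitask idea described above can be sketched as a shared encoder feeding two heads: a frame-level change-detection head (the SCD task) and an auxiliary speaker-classification head used only during training. The sketch below is a minimal NumPy forward pass under assumed, illustrative dimensions; all layer sizes, weight names, and the concatenation of acoustic and content features are assumptions for illustration, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Illustrative dimensions (not from the paper):
# T frames, acoustic feature dim, content embedding dim, hidden dim, speaker count.
T, D_ACOUSTIC, D_CONTENT, H, N_SPEAKERS = 200, 40, 16, 64, 4

# Shared encoder weights plus two task heads:
# - change head: per-frame probability of a speaker change (the SCD task)
# - speaker head: per-frame speaker posterior (auxiliary task, training only)
W_enc = rng.normal(0, 0.1, (D_ACOUSTIC + D_CONTENT, H))
W_change = rng.normal(0, 0.1, (H, 1))
W_spk = rng.normal(0, 0.1, (H, N_SPEAKERS))

def forward(acoustic, content):
    """Concatenate acoustic and content features, encode, run both heads."""
    x = np.concatenate([acoustic, content], axis=-1)  # (T, D_ACOUSTIC + D_CONTENT)
    h = relu(x @ W_enc)                               # shared representation
    p_change = 1.0 / (1.0 + np.exp(-(h @ W_change)))  # (T, 1) change probability
    p_speaker = softmax(h @ W_spk)                    # (T, N_SPEAKERS) posterior
    return p_change, p_speaker

acoustic = rng.normal(size=(T, D_ACOUSTIC))  # e.g. filterbank frames
content = rng.normal(size=(T, D_CONTENT))    # e.g. content embedding from a decomposition model
p_change, p_speaker = forward(acoustic, content)
print(p_change.shape, p_speaker.shape)  # (200, 1) (200, 4)
```

In training, a combined loss over both heads would let the speaker labels shape the shared representation; at inference, only the change head is needed.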
