An Iterative Framework for Self-Supervised Deep Speaker Representation Learning

Danwei Cai, Ming Li, Weiqing Wang

doi:10.1109/icassp39728.2021.9414713

Abstract

In this paper, we propose an iterative framework for self-supervised speaker representation learning based on a deep neural network (DNN). The framework starts by training a self-supervised speaker embedding network that maximizes agreement between different segments of the same utterance via a contrastive loss. Taking advantage of a DNN’s ability to learn from data with label noise, we cluster the speaker embeddings obtained from the previous speaker network and use the resulting cluster assignments as pseudo labels to train a new DNN. We then iteratively retrain the speaker network with pseudo labels generated in the previous step to bootstrap the discriminative power of the DNN. Speaker verification experiments are conducted on the VoxCeleb dataset. The results show that the proposed iterative self-supervised learning framework outperforms previous work using self-supervision. After 5 iterations, the speaker network obtains a 61% performance gain over the speaker embedding model trained with contrastive loss.
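The two building blocks named in the abstract, a contrastive loss over segments of the same utterance and clustering of embeddings into pseudo labels, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify the loss formulation or the clustering algorithm, so an InfoNCE-style loss and plain k-means with farthest-point initialisation stand in for them, the function names are hypothetical, and the DNN training step between rounds is omitted.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))

def contrastive_loss(seg_a, seg_b, temperature=0.1):
    """Mean InfoNCE-style loss (assumed form).

    seg_a[i] and seg_b[i] are embeddings of two segments cut from utterance i;
    every other row of seg_b acts as a negative for seg_a[i].
    """
    total = 0.0
    for i, anchor in enumerate(seg_a):
        logits = [cosine(anchor, other) / temperature for other in seg_b]
        peak = max(logits)  # subtract the max for numerical stability
        log_norm = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_norm - logits[i]  # negative log-softmax of the positive pair
    return total / len(seg_a)

def sq_dist(u, v):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(u, v))

def kmeans_pseudo_labels(embeddings, num_clusters, num_steps=20):
    """Cluster embeddings with k-means and return one pseudo label per embedding."""
    # Farthest-point initialisation: an assumption, chosen because it is deterministic.
    centroids = [list(embeddings[0])]
    while len(centroids) < num_clusters:
        farthest = max(embeddings, key=lambda e: min(sq_dist(e, c) for c in centroids))
        centroids.append(list(farthest))
    labels = [0] * len(embeddings)
    for _ in range(num_steps):
        # Assignment step: each embedding takes the index of its nearest centroid.
        labels = [min(range(num_clusters), key=lambda k: sq_dist(e, centroids[k]))
                  for e in embeddings]
        # Update step: move each centroid to the mean of its members.
        for k in range(num_clusters):
            members = [e for e, l in zip(embeddings, labels) if l == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy data: matching segment pairs give a lower loss than mismatched pairs.
aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])

# Toy embeddings forming two well-separated groups.
toy_embeddings = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
pseudo_labels = kmeans_pseudo_labels(toy_embeddings, num_clusters=2)
print(aligned < mismatched, pseudo_labels)  # True [0, 0, 0, 1, 1, 1]
```

In the iterative scheme the abstract describes, each round would re-extract embeddings with the latest network, call something like `kmeans_pseudo_labels` on them, and train a new classifier on the resulting labels. That training step is the part this sketch leaves out.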

Similar Papers

Incorporating Visual Information in Audio Based Self-Supervised Speaker Recognition
Danwei Cai ... Weiqing Wang
IEEE/ACM Transactions on Audio, Speech, and Language Processing | VOL. 30
01 Jan 2021

Deep Margin-Sensitive Representation Learning for Cross-Domain Facial Expression Recognition
Yingjian Li ... Zheng Zhang
IEEE Transactions on Multimedia | VOL. 25
01 Jan 2023

A Deep Representation Learning Framework for Medical Imaging Data Analysis
Pengcheng Xi
24 Jun 2020

Graph Barlow Twins: A self-supervised representation learning framework for graphs
Piotr Bielak ... Nitesh V Chawla
Knowledge-Based Systems | VOL. 256
17 Aug 2022
