Constrained Output Embeddings for End-to-End Code-Switching Speech Recognition with Only Monolingual Data

Yerbolat Khassanov, Haihua Xu, Chongjia Ni, Bin Ma, Zhiping Zeng, Van Tung Pham, Eng Siong Chng

doi:10.21437/interspeech.2019-1867

Open Access

https://doi.org/10.21437/interspeech.2019-1867

Abstract

The lack of code-switching training data is one of the major concerns in the development of end-to-end code-switching automatic speech recognition (ASR) models. In this work, we propose a method to train an improved end-to-end code-switching ASR model using only monolingual data. Our method encourages the distributions of the output token embeddings of the monolingual languages to be similar, and hence helps the ASR model to code-switch easily between languages. Specifically, we propose to use Jensen-Shannon divergence and cosine distance based constraints. The former enforces the output embeddings of the monolingual languages to possess similar distributions, while the latter simply brings the centroids of the two distributions close to each other. Experimental results demonstrate the high effectiveness of the proposed method, yielding up to 4.5% absolute mixed error rate improvement on a Mandarin-English code-switching ASR task.
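
The two constraints named in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the toy embedding matrices, and the diagonal-Gaussian fit are all assumptions made for illustration. In particular, the exact Jensen-Shannon divergence between two Gaussians has no closed form, so the sketch swaps in a symmetrized KL divergence between diagonal-Gaussian fits as a tractable stand-in; the abstract does not say how the divergence is computed.

```python
import numpy as np


def centroid_cosine_distance(emb_a, emb_b):
    """Cosine distance between the centroids of two embedding sets.

    emb_a, emb_b: arrays of shape (num_tokens, dim), one row per output token.
    """
    mu_a = emb_a.mean(axis=0)
    mu_b = emb_b.mean(axis=0)
    cos = mu_a @ mu_b / (np.linalg.norm(mu_a) * np.linalg.norm(mu_b))
    return 1.0 - cos


def diag_gaussian_kl(mu_p, var_p, mu_q, var_q):
    """KL(p || q) for two Gaussians with diagonal covariance."""
    return 0.5 * np.sum(
        np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0
    )


def symmetric_gaussian_divergence(emb_a, emb_b, eps=1e-6):
    """Symmetrized KL between diagonal-Gaussian fits of two embedding sets.

    Assumption: used here as a stand-in for a Jensen-Shannon style constraint.
    """
    mu_a, var_a = emb_a.mean(axis=0), emb_a.var(axis=0) + eps
    mu_b, var_b = emb_b.mean(axis=0), emb_b.var(axis=0) + eps
    return 0.5 * diag_gaussian_kl(mu_a, var_a, mu_b, var_b) + 0.5 * diag_gaussian_kl(
        mu_b, var_b, mu_a, var_a
    )


# Toy data (hypothetical): 50 "Mandarin" and 30 "English" output token
# embeddings of dimension 8, drawn around different means.
rng = np.random.default_rng(0)
mandarin_emb = rng.normal(loc=1.0, size=(50, 8))
english_emb = rng.normal(loc=-1.0, size=(30, 8))

cosine_penalty = centroid_cosine_distance(mandarin_emb, english_emb)
divergence_penalty = symmetric_gaussian_divergence(mandarin_emb, english_emb)
```

In a training setup, a penalty of this kind would presumably be added to the ASR loss with a weighting factor, so that the output embeddings of the two languages are pulled toward each other while the model is trained on monolingual data.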

Similar Papers

Monolingual Data Selection Analysis for English-Mandarin Hybrid Code-Switching Speech Recognition
Haobo Zhang ... Eng Siong Chng
25 Oct 2020

CECOS: A Chinese-English code-switching speech database
Han-Ping Shen ... Yan-Ting Yang
01 Oct 2011

TALCS: An open-source Mandarin-English code-switching corpus and a speech recognition baseline
Chengfei Li ... Guangjing Wang
18 Sep 2022

Improving N-Best Rescoring in Under-Resourced Code-Switched Speech Recognition Using Pretraining and Data Augmentation
Joshua Jansen Van Vüren ... Thomas Niesler
Languages | VOL. 7
13 Sep 2022
