Lithuanian Broadcast Speech Transcription Using Semi-supervised Acoustic Model Training

Rasa Lileikytė, Arseniy Gorin, Lori Lamel, Jean-Luc Gauvain, Thiago Fraga-Silva

doi:10.1016/j.procs.2016.04.037

Open Access

https://doi.org/10.1016/j.procs.2016.04.037

Abstract

This paper reports on experimental work to build a speech transcription system for Lithuanian broadcast data, relying on unsupervised and semi-supervised training methods, as well as other low-knowledge methods, to compensate for missing resources. Unsupervised acoustic model training is investigated using 360 hours of untranscribed speech data. A graphemic pronunciation approach is used to simplify pronunciation model generation and therefore ease language model adaptation for the system users. Discriminative training on top of semi-supervised training is also investigated, as well as various types of acoustic features and their combinations. Experimental results are provided for each of our development steps, along with contrastive results comparing various options. Using the best system configuration, a word error rate of 18.3% is obtained on a set of development data from the Quaero program.
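
To illustrate the general idea behind a graphemic pronunciation approach, the following is a minimal sketch, not the authors' implementation. It assumes that each letter of a word becomes one pronunciation unit and that non-letter characters carry no unit; the function names `graphemic_pronunciation` and `build_lexicon` and the example words are hypothetical, and the actual grapheme inventory used in the paper may differ.

```python
import unicodedata


def graphemic_pronunciation(word):
    """Map a word to a list of grapheme units, one unit per letter."""
    # Normalize to NFC so that letters with diacritics (e.g. "č", "ū")
    # are single code points rather than a base letter plus a combining mark.
    normalized = unicodedata.normalize("NFC", word.lower())
    # Assumption: non-letter characters (hyphens, apostrophes, digits)
    # are dropped rather than mapped to a unit.
    return [ch for ch in normalized if ch.isalpha()]


def build_lexicon(words):
    """Build a word -> grapheme-sequence lexicon from a word list."""
    return {word: graphemic_pronunciation(word) for word in words}


if __name__ == "__main__":
    # Hypothetical word list; a real system would take it from the
    # language model vocabulary.
    for word, units in build_lexicon(["Vilnius", "šiandien", "ačiū"]).items():
        print(word, " ".join(units))
```

Because such a lexicon is derived from spelling alone, new words can be added to the vocabulary without hand-written phonemic pronunciations, which is consistent with the abstract's remark that the approach eases language model adaptation.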

Journal: Procedia Computer Science
Publication Date: Jan 1, 2016
Citations: 11
License type: cc-by-nc-nd

Similar Papers

Semi-supervised training strategies for deep neural networks
Matthew Gibson ... Gary Cook
01 Dec 2017

Semi-Supervised Speech Recognition Acoustic Model Training Using Policy Gradient
Hoon Chung ... Jeon Gue Park
Applied Sciences | VOL. 10
20 May 2020

Unsupervised acoustic and language model training with small amounts of labelled data
Scott Novotney ... Richard Schwartz
01 Apr 2009

Semi-supervised GMM and DNN acoustic model training with multi-system combination and confidence re-calibration
Yan Huang ... Dong Yu
25 Aug 2013
