A Novel Method of Language Modeling for Automatic Captioning in TC Video Teleconferencing

Xiaojia Zhang, Laura Schopp, Yunxin Zhao

doi:10.1109/titb.2006.885549

Abstract

We are developing an automatic captioning system for teleconsultation video teleconferencing (TC-VTC) in telemedicine, based on large-vocabulary conversational speech recognition. In TC-VTC, doctors' speech is spontaneous in style and contains a large number of infrequently used medical terms. Because in-domain data are insufficient, we adopt mixture language modeling, with models trained from several datasets of medical and nonmedical domains. This paper proposes novel modeling and estimation methods for the mixture language model (LM). Component LMs are trained from individual datasets, with class n-gram LMs trained from in-domain datasets and word n-gram LMs trained from out-of-domain datasets, and they are interpolated into a mixture LM. For the class LMs, semantic categories are used to define classes for medical terms, names, and digits. The interpolation weights of the mixture LM are estimated by a greedy forward weight adjustment (FWA) algorithm. The proposed mixing of in-domain class LMs and out-of-domain word LMs, the semantic definitions of word classes, and the FWA weight-estimation algorithm are all effective on the TC-VTC task. Compared with mixtures of word LMs whose weights are estimated by the conventional expectation-maximization algorithm, the proposed methods reduced perplexity by 21% on test sets from five doctors, which translated into improvements in captioning accuracy.
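To make the idea of interpolating component LMs concrete, the following is a minimal Python sketch of a linearly interpolated mixture LM whose weights are tuned greedily against held-out perplexity. It is an illustration only: the abstract does not specify the details of FWA, so the update rule here (shift a fixed fraction of the weight mass toward whichever component most reduces perplexity, stop when no shift helps) is an assumption, and the function names, the `step` parameter, and the precomputed per-word probabilities are all hypothetical.

```python
import math


def mixture_probs(component_probs, weights):
    """Linearly interpolate per-word probabilities from several component LMs.

    component_probs[k][t] is the probability that component k assigns to the
    t-th held-out word given its history (assumed to be precomputed).
    """
    n_words = len(component_probs[0])
    return [
        sum(w * probs[t] for w, probs in zip(weights, component_probs))
        for t in range(n_words)
    ]


def perplexity(component_probs, weights):
    """Perplexity of the interpolated mixture on the held-out words."""
    probs = mixture_probs(component_probs, weights)
    log_sum = sum(math.log(p) for p in probs)
    return math.exp(-log_sum / len(probs))


def greedy_forward_weights(component_probs, step=0.05, max_iters=1000):
    """Greedy weight search (an assumed scheme, not the paper's exact FWA).

    Starts from uniform weights. In each iteration, tries moving a fraction
    `step` of the total weight toward each component in turn and keeps the
    single move that lowers held-out perplexity the most.
    """
    k = len(component_probs)
    weights = [1.0 / k] * k
    best = perplexity(component_probs, weights)
    for _ in range(max_iters):
        best_move = None
        for i in range(k):
            # Scale all weights down, then give the freed mass to component i,
            # so the trial weights still sum to one.
            trial = [(1.0 - step) * w for w in weights]
            trial[i] += step
            ppl = perplexity(component_probs, trial)
            if ppl < best - 1e-12:
                best, best_move = ppl, trial
        if best_move is None:
            break  # no single adjustment improves perplexity any further
        weights = best_move
    return weights, best


if __name__ == "__main__":
    # Toy held-out probabilities from two hypothetical component LMs.
    toy = [[0.5, 0.4, 0.6], [0.01, 0.02, 0.01]]
    w, ppl = greedy_forward_weights(toy)
    print("weights:", w, "perplexity:", ppl)
```

In a real system the per-word probabilities would come from the trained class n-gram and word n-gram components, evaluated on held-out in-domain text. The sketch only shows how interpolation weights and perplexity relate.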

Journal: IEEE Transactions on Information Technology in Biomedicine
Publication Date: May 1, 2007
Citations: 16

Similar Papers

Hybrid word/Part-of-Arabic-Word Language Models for Arabic text document recognition
Mohamed Faouzi Benzeghiba ... Jerome Louradour
01 Aug 2015

Using semantic analysis to improve speech recognition performance
Hakan Erdogan ... Michael Picheny
Computer Speech & Language | VOL. 19
23 Nov 2004

Recent experiments in large vocabulary conversational speech recognition
J Billa ... S Marsoukas
01 Jan 1998

Listen, attend and spell: A neural network for large vocabulary conversational speech recognition
William Chan ... Oriol Vinyals
01 Mar 2016
