Abstract

This paper describes some recent experiments on unsupervised language model adaptation for transcription of broadcast news data. In previous work, a framework for automatically selecting adaptation data using information retrieval techniques was proposed. This work extends the method and presents experimental results with unsupervised language model adaptation. Three primary aspects are considered: (1) the performance of 5 widely used LM adaptation methods using the same adaptation data is compared; (2) the influence of the temporal distance between the training and test data epochs on the adaptation efficiency is assessed; and (3) show-based language model adaptation is compared with story-based language model adaptation. Experiments have been carried out for broadcast news transcription in English and Mandarin Chinese. A relative word error rate reduction of 4.7% was obtained in English and a 5.6% relative character error rate reduction in Mandarin with story-based MDI adaptation.

1. INTRODUCTION

While n-gram models are successfully used in speech recognition, their performance is influenced by any mismatch between the training and test data [7]. The idea of language model (LM) adaptation is to use a small amount of domain-specific data to adjust the LM to reduce the impact of linguistic differences between the training and testing data. Different schemes for LM adaptation have been proposed, such as the cache model, based on the observation that a word which occurred in a recent text has a higher probability of being seen again [9]; the trigger model, which uses trigger word pairs to capture semantic information [10]; and structured LMs [1].

Broadcast news (BN) transcription is a complicated task for both acoustic and language modeling. The linguistic attributes of BN data are complex, arising from the many different speaking styles, from spontaneous conversation to prepared speech (close in style to written texts).
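The cache model mentioned above can be illustrated with a minimal sketch: a unigram cache built from the recent word history is linearly interpolated with a static base-model probability, so that recently seen words are boosted. The function name, the interpolation weight, and the unigram form of the cache are illustrative assumptions, not details from the paper.

```python
from collections import Counter

def cache_lm_prob(word, base_prob, history, lam=0.9):
    """Interpolate a static LM probability with a unigram cache
    estimated from the recent history (cache model idea of [9]).
    `lam` weights the base model; (1 - lam) weights the cache.
    The weight 0.9 is an illustrative choice, not tuned."""
    cache = Counter(history)
    cache_prob = cache[word] / len(history) if history else 0.0
    return lam * base_prob + (1.0 - lam) * cache_prob

# A word that occurred in the recent history gets a boosted probability:
recent = ["market", "stocks", "fell", "market"]
boosted = cache_lm_prob("market", base_prob=0.001, history=recent)
unseen = cache_lm_prob("giraffe", base_prob=0.001, history=recent)
```

Here `boosted` exceeds `unseen` because "market" appears twice in the recent history, which is the effect the cache model exploits.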
The content of BN data is open, and any given BN show covers multiple topics. As a consequence, it is difficult to predict the topics of a BN show without looking at the data itself. The only information available for the show is the hypothesis output from the speech recognizer. However, for any given broadcast, the number of words in the hypothesized transcript is quite small, and the transcript contains recognition errors. Therefore the transcripts are not sufficient for use as an adaptation corpus. Information retrieval (IR) methods provide a means to address this problem. Instead of directly using the ASR hypotheses for LM adaptation, they can be used as queries to an IR system in order to select additional on-topic adaptation data from a large general corpus. This approach reduces the effect of transcription errors in the hypotheses and at the same time provides substantially more textual data for LM estimation.

In this paper, a series of experiments is presented exploring the general framework of unsupervised LM adaptation using IR methods [3]. The performance of a variety of popular techniques for LM adaptation using automatically selected adaptation data is compared. The investigated techniques are linear interpolation, maximum a posteriori (MAP) adaptation, mixture models, dynamic mixture models, and minimum discrimination information (MDI) adaptation. The effect of the temporal distance between the epoch of the adaptation corpus and the epoch of the test data is also assessed. As mentioned above, a given BN show typically covers several stories, with each story being related to a different topic. To address the changing properties of BN data, static and dynamic models for LM adaptation are investigated. In static modeling the LM is updated once for the whole show, which means that the LM must simultaneously fit multiple topics. Dynamic modeling updates the LM at each automatically detected story change, which entails estimating multiple story-based LMs for each BN show.
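The IR-based selection step can be sketched with a simple TF-IDF retrieval loop: the (possibly errorful) ASR hypothesis is treated as a query, and the most similar articles in a large general corpus are returned as adaptation data. The TF-IDF weighting and cosine scoring below are one standard IR choice, assumed here for illustration; the paper does not specify this exact scheme.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build simple TF-IDF vectors for a list of tokenized documents."""
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    n = len(docs)
    idf = {w: math.log(n / df[w]) for w in df}
    vecs = [{w: tf * idf[w] for w, tf in Counter(doc).items()} for doc in docs]
    return vecs, idf

def cosine(u, v):
    """Cosine similarity between two sparse vectors (dicts)."""
    dot = sum(x * v.get(w, 0.0) for w, x in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def select_adaptation_data(hypothesis, corpus, top_k=1):
    """Use the ASR hypothesis as an IR query and return the indices
    of the top-k most similar corpus articles as adaptation data."""
    vecs, idf = tfidf_vectors(corpus)
    query = {w: tf * idf.get(w, 0.0) for w, tf in Counter(hypothesis).items()}
    ranked = sorted(range(len(corpus)),
                    key=lambda i: cosine(query, vecs[i]), reverse=True)
    return ranked[:top_k]
```

Even when the hypothesis contains misrecognized words (which simply fail to match the corpus vocabulary), the on-topic content words still drive the retrieval, which is why this step is robust to transcription errors.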
Experiments are carried out for BN transcription in American English and Mandarin Chinese.
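Of the adaptation techniques compared, linear interpolation is the simplest: the adapted model is a weighted combination of the background LM and an LM estimated on the selected adaptation data, P(w|h) = λ·P_bg(w|h) + (1−λ)·P_adapt(w|h). The sketch below shows this for unigram distributions; the fixed weight is an illustrative assumption (in practice λ would be tuned on held-out data, e.g. by EM).

```python
def interpolate_lms(p_background, p_adapt, lam=0.8):
    """Linear interpolation of a background LM with an adaptation LM:
    P(w) = lam * P_bg(w) + (1 - lam) * P_adapt(w).
    Both inputs are word -> probability dicts; lam=0.8 is illustrative."""
    vocab = set(p_background) | set(p_adapt)
    return {w: lam * p_background.get(w, 0.0)
               + (1.0 - lam) * p_adapt.get(w, 0.0)
            for w in vocab}

p_bg = {"the": 0.5, "market": 0.2, "giraffe": 0.3}
p_ad = {"the": 0.4, "market": 0.6}   # estimated on the adaptation data
p_mix = interpolate_lms(p_bg, p_ad, lam=0.8)
```

Since both inputs are proper distributions, the interpolated model still sums to one, while probability mass shifts toward words favored by the adaptation data.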


