Language Models For Speech Recognition Research Articles

This article examines the use of statistically discovered morpheme-like units for Spoken Document Retrieval (SDR). The morpheme-like units (morphs) are used both for language modeling in speech recognition and as index terms. Traditional word-based methods suffer from out-of-vocabulary words: if a word is not in the recognizer vocabulary, every occurrence of that word in speech will be missing from the transcripts. The problem is especially severe for languages with a large number of distinct word forms, such as Finnish. With a morph language model, even previously unseen words can be recognized by identifying their component morphs. Similarly, in information retrieval queries, complex word forms, even unseen ones, can be matched to data after segmenting them into morphs. Retrieval performance can be further improved by expanding the transcripts with alternative recognition results from confusion networks. For this article, a novel retrieval evaluation corpus was constructed, consisting of unsegmented Finnish radio programs, 25 queries, and corresponding human relevance assessments. Previous results on using morphs and confusion networks for Finnish SDR are confirmed and extended to the unsegmented case. As in previous work, using morphs or base forms as index terms yields about equal performance, but combination methods, including a new one, are found to work better than either alone. Using alternative morph segmentations of the query words is found to further improve the results. Lexical similarity-based story segmentation was applied, and performance using morphs, base forms, and their combinations was compared for the first time.
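The idea of matching unseen word forms through their morphs can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy morph inventory `MORPHS`, the greedy longest-match `segment` function, and the `build_index` and `search` helpers are hypothetical and stand in for the statistically learned segmentation and the retrieval system described in the abstract.

```python
from collections import defaultdict

# Hypothetical toy morph inventory; a real system would learn it statistically.
MORPHS = {"talo", "auto", "kauppa", "ssa", "lle", "i"}

def segment(word, morphs=MORPHS):
    """Greedy longest-match segmentation (a stand-in for a learned model).
    Characters that match no morph are emitted as single-character units."""
    out, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in morphs:
                out.append(word[i:j])
                i = j
                break
        else:
            out.append(word[i])
            i += 1
    return out

def build_index(docs):
    """Map each morph to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            for morph in segment(word):
                index[morph].add(doc_id)
    return index

def search(index, query):
    """Rank documents by the number of distinct query morphs they contain."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for morph in set(segment(word)):
            for doc_id in index.get(morph, ()):
                scores[doc_id] += 1
    return sorted(scores, key=lambda d: (-scores[d], d))

docs = {"d1": "talossa", "d2": "kauppa"}
index = build_index(docs)
# "taloille" never occurs in the documents, but it shares the morph "talo" with d1.
print(search(index, "taloille"))
```

In this sketch the query word form is absent from the indexed text, yet the shared stem morph still retrieves the relevant document, which is the effect the abstract attributes to morph-based index terms.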

Bayesian Belief Networks are a powerful tool for combining different knowledge sources with various degrees of uncertainty in a mathematically sound and computationally efficient way. Surprisingly, they have not yet found their way into the speech processing field, even though multiple unreliable information sources exist in this field. The present paper shows how the theory can be utilized for language modeling. After providing an introduction to the theory of Bayesian Networks, we develop several extensions to the classic theory by describing mechanisms for dealing with statistical dependence among daughter nodes (usually assumed to be conditionally independent) and by providing a learning algorithm, based on the EM algorithm, with which the probabilities of link matrices can be learned from example data. Using these extensions, a language model for speech recognition based on a context-free framework is constructed. In this model, sentences are not parsed in their entirety, as is usual with grammatical description, but only "locally" on suitably located segments. The model was evaluated on a text database. In terms of test set entropy, the model performed at least as well as the bigram and trigram models, while showing a good ability to generalize from training to test data.

Articles published on Language Models For Speech Recognition

Speech retrieval from unsegmented Finnish audio using statistical morpheme-like units for segmentation, recognition, and retrieval

A unified language model for large vocabulary continuous speech recognition of Turkish

A statistical language model for conventional speech reflecting the previous utterance of the other participant

Multiclass composite N‐gram language model based on connection direction

Probabilistic Top-Down Parsing and Language Modeling

Structured language modeling

Classifications de mots non étiquetés par des méthodes statistiques

A task adaptation method and use of idiomatic expression of stochastic language model for speech recognition

BALANCING STOCHASTIC KNOWLEDGE ON ACOUSTICS AND LINGUISTICS

Dynamic language model for speech recognition

Bayesian Belief Networks as a tool for stochastic parsing

Analysing a simple language model·some general conclusions for language models for speech recognition

On the bias of the Turing-Good estimate of probabilities

Computation of the probability of initial substring generation by stochastic context-free grammars

HMM speech recognition using stochastic language models.

A cache-based natural language model for speech recognition
