Automatic Speech Recognition Models Research Articles

The design of Spoken Dialog Systems cannot be reduced to a simple combination of speech processing technologies. Speech-based interface design has long been an expert job that requires strong skills in speech technologies and low-level programming. Moreover, rapid development and reuse of previously designed systems remain difficult, which makes optimal design and objective evaluation very hard to achieve. The design process is therefore cyclic, composed of prototype releases, user satisfaction surveys, bug reports and refinements. Human intervention for testing is well known to be time-consuming and, above all, very expensive. This is one of the reasons for the recent interest in dialog simulation for evaluation as well as for design automation and optimization. In this paper we present a probabilistic framework for realistic simulation of spoken dialogs in which the major components of a dialog system are modeled and parameterized using independent data or expert knowledge. In particular, an Automatic Speech Recognition (ASR) system model and a User Model (UM) have been developed. The ASR model, based on articulatory similarities in language models, provides task-adaptive performance prediction and Confidence Level (CL) distribution estimation. The user model relies on the Bayesian Networks (BN) paradigm and is used both for user behavior modeling and Natural Language Understanding (NLU) modeling. The complete simulation framework has been used to train a reinforcement-learning agent on two different tasks. These experiments helped to identify several potentially problematic dialog scenarios.
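As a rough illustration of the kind of simulation loop the abstract describes, the sketch below pairs a trivial user model with a toy ASR channel that reports a confidence level, and lets a fixed strategy re-ask whenever the confidence falls below a threshold. It is not the authors' framework: the function names, the four-word vocabulary, the error rate and the uniform confidence ranges are all assumptions made for the example, and the articulatory ASR model, the Bayesian Network user model and the reinforcement-learning agent are not reproduced.

```python
import random
from dataclasses import dataclass


@dataclass
class AsrResult:
    word: str          # word reported by the simulated recognizer
    confidence: float  # confidence level (CL) in [0, 1]


def simulate_user(goal: str) -> str:
    """Toy user model (assumption): the user always states the goal value."""
    return goal


def simulate_asr(word, vocabulary, error_rate, rng):
    """Toy ASR model: substitute a wrong word with probability error_rate.

    Errors draw their CL from [0.0, 0.6] and correct results from [0.4, 1.0];
    these ranges are illustrative assumptions, not measured distributions.
    """
    if rng.random() < error_rate:
        wrong = rng.choice([w for w in vocabulary if w != word])
        return AsrResult(wrong, rng.uniform(0.0, 0.6))
    return AsrResult(word, rng.uniform(0.4, 1.0))


def run_dialog(goal, vocabulary, error_rate, threshold, rng, max_turns=10):
    """Re-ask until a result reaches the CL threshold; return (turns, success)."""
    for turn in range(1, max_turns + 1):
        result = simulate_asr(simulate_user(goal), vocabulary, error_rate, rng)
        if result.confidence >= threshold:
            return turn, result.word == goal
    return max_turns, False


def evaluate(threshold, n_dialogs=1000, error_rate=0.3, seed=0):
    """Return (mean turns, success rate) over n_dialogs simulated dialogs."""
    rng = random.Random(seed)
    vocabulary = ["paris", "london", "berlin", "madrid"]  # hypothetical task
    total_turns = 0
    successes = 0
    for _ in range(n_dialogs):
        goal = rng.choice(vocabulary)
        turns, ok = run_dialog(goal, vocabulary, error_rate, threshold, rng)
        total_turns += turns
        successes += ok
    return total_turns / n_dialogs, successes / n_dialogs


if __name__ == "__main__":
    for threshold in (0.0, 0.5, 0.7):
        mean_turns, success_rate = evaluate(threshold)
        print(f"threshold={threshold}: {mean_turns:.2f} turns, "
              f"{success_rate:.1%} success")
```

Running the script compares three rejection thresholds on the same simulated users. In this toy setup a threshold above 0.6 never accepts a misrecognition, at the cost of longer dialogs, which is the kind of trade-off a learned strategy would have to balance.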

Despite the difficulty of defining the syllable unequivocally, and the controversy over its role in theories of spoken and written language processing, the syllable is a potentially useful unit in several practical tasks that arise in computational linguistics and speech technology. For instance, syllable structure might embody valuable information for building word models in automatic speech recognition, and concatenative speech synthesis might use syllables or demisyllables as basic units. In this paper, we first present an algorithm for determining syllable boundaries in the orthographic form of unknown words that works by analogical reasoning from a database or corpus of known syllabifications. We call this syllabification by analogy (SbA). Its motivation is similar to that of our existing pronunciation by analogy (PbA), which predicts pronunciations for unknown words (specified by their spellings) by inference from a dictionary of known word spellings and corresponding pronunciations. We show that including perfect (according to the corpus) syllable boundary information in the orthographic input can dramatically improve the performance of pronunciation by analogy of English words, but such information would not be available to a practical system. We therefore next investigate combining automatically inferred syllabification and pronunciation in two different ways: the series model, in which syllabification is followed sequentially by pronunciation generation; and the parallel model, in which syllabification and pronunciation are inferred simultaneously. Unfortunately, neither improves performance over PbA without syllabification. Possible reasons for this failure are explored via an analysis of syllabification and pronunciation errors.

Articles published on Automatic Speech Recognition Models

Hidden Markov Acoustic Modeling With Bootstrap and Restructuring for Low-Resourced Languages

Semi-supervised learning for speech and audio processing

Collecting and evaluating speech recognition corpora for 11 South African languages

An Adaptive Utterance Verification Framework Using Minimum Verification Error Training

Dual stream speech recognition using articulatory syllable models

Characterisation and identification of non-native French accents

Dealing with noise in automatic speech recognition.

Phrase classes in two-level language models for ASR

Active learning process for spoken dialog systems

Integration of Speech Recognition and Machine Translation in Computer-Assisted Translation

A Portable robot audition software system for multiple simultaneous speech signals

Broader range of training voices improves performance of HMM model of phonemic identification

Environmental Independent ASR Model Adaptation/Compensation by Bayesian Parametric Representation

Method and apparatus for rejection of speech recognition results in accordance with confidence level

Testing the performance of spoken dialogue systems by means of an artificially simulated user

On the relevance of some spectral and temporal patterns for vowel classification

A probabilistic framework for dialog simulation and optimal strategy learning

Can syllabification improve pronunciation by analogy of English?

System and methods for acoustic and language modeling for automatic speech recognition with large vocabularies

On second-order statistics of log-periodogram with correlated components
