Compact Acoustic Models for Embedded Speech Recognition

Christophe Lévy, Jean-François Bonastre, Georges Linarès

doi:10.1155/2009/806186

Open Access

https://doi.org/10.1155/2009/806186

Abstract

Speech recognition applications are known to require a significant amount of resources. However, embedded speech recognition allows only a few KB of memory, a few MIPS, and a small amount of training data. To fit the resource constraints of embedded applications, an approach based on a semicontinuous HMM system using state-independent acoustic modelling is proposed. A transformation is computed and applied to the global model to obtain the probability density functions of each HMM state, so that only the transformation parameters need to be stored. This approach is evaluated on two tasks: digit recognition and voice-command recognition. A fast adaptation technique for acoustic models is also proposed. To significantly reduce computational costs, the adaptation is performed only on the global model (using related speaker recognition adaptation techniques), with no need for state-dependent data. The whole approach results in a relative gain of more than 20% compared to a basic HMM-based system fitting the constraints.

Highlights

The amount and diversity of services offered by the latest generation of mobile phones have increased significantly during the last decade, and manufacturers consider these new services crucial in terms of both functionality and marketing impact.
With the compact model, the Digit Error Rate (DER) decreases from 4.32% to 2.17%, a relative decrease of about 50%.
DER is between 2.17% and 2.83%.
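
The "about 50%" figure follows directly from the two DER values quoted above. A minimal sketch of that arithmetic is given here; the function name `relative_reduction` is illustrative and does not come from the paper.

```python
def relative_reduction(baseline_rate, new_rate):
    """Relative decrease of an error rate, as a fraction of the baseline."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return (baseline_rate - new_rate) / baseline_rate


# DER values quoted in the highlights: 4.32% before, 2.17% with the compact model.
print(f"{relative_reduction(4.32, 2.17):.1%}")  # prints 49.8%
```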

Summary

Introduction

The amount and diversity of services offered by the latest generation of mobile phones (and similar embedded devices) have increased significantly during the last decade, and manufacturers consider these new services crucial in terms of both functionality and marketing impact. Most recent ASR systems rely on Gaussian or state sharing, where parameter tying reduces computational time and the memory footprint while providing an efficient way of estimating large context-dependent models [4,5,6]. We present a new acoustic-model architecture in which parameters are massively factored, with the aim of reducing the memory footprint of an embedded ASR system while preserving recognition accuracy. This factoring relies on a multi-level modelling scheme in which a universal background model can be successively specialized to environment, speaker, and acoustic units.
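
To make the memory argument behind parameter tying concrete, here is a minimal sketch under stated assumptions. It uses the simplest semicontinuous form, in which every state shares one Gaussian codebook and stores only its own mixture weights; the transformation actually used in the paper may differ. All function names and the sizes in the usage note are hypothetical.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def state_log_likelihood(x, shared_means, shared_vars, state_weights):
    """log p(x | state) using shared Gaussians and state-specific weights.

    Assumption: at least one weight is strictly positive.
    """
    terms = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(state_weights, shared_means, shared_vars)
        if w > 0.0
    ]
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def parameter_counts(n_states, n_gauss_shared, n_gauss_per_state, dim):
    """Stored parameters: tied codebook vs. one private GMM per state."""
    per_gauss = 2 * dim  # one mean vector plus one diagonal variance vector
    tied = n_gauss_shared * per_gauss + n_states * n_gauss_shared
    untied = n_states * n_gauss_per_state * (per_gauss + 1)
    return tied, untied
```

With hypothetical sizes (100 states, a 64-Gaussian shared codebook, 8 private Gaussians per state in the untied case, 39-dimensional features), `parameter_counts(100, 64, 8, 39)` returns 11392 tied parameters against 63200 untied ones. The saving comes from storing means and variances once rather than once per state.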

The Proposed Approach

Corpora

Application Independent Corpus

Application Dependent Corpora

Baseline Systems

The Approach Proposed

Fast Acoustic Adaptation

Conclusion

Journal: EURASIP Journal on Audio, Speech, and Music Processing
Publication Date: Jan 1, 2009
Citations: 7
License type: cc-by
