A novel approach for modeling non-keyword intervals in a keyword spotter exploiting acoustic similarities of languages

Panikos Heracleous, Tohru Shimizu

doi:10.1016/j.specom.2004.10.016

Abstract

This paper presents a new keyword spotting technique. A critical issue in keyword spotting is the explicit modeling of the non-keyword portions of speech. To date, most keyword spotters represent these portions with a set of hidden Markov models (HMMs), the garbage models. A widely used approach splits the training data into keyword and non-keyword speech: the keyword HMMs are trained on the keyword speech, and the garbage models on the non-keyword speech. The main disadvantage of this method is its task dependence. Another approach uses a common set of acoustic models for both the keywords and the garbage models, but it faces a major problem. In a keyword spotter, the garbage models are usually connected so that any sequence is allowed, and the keywords are therefore among those sequences. When the same training data are used for the keyword and garbage models, the garbage models also cover the keywords.
To overcome these problems, we propose a new method for modeling the non-keyword intervals. The garbage models are phonemic HMMs trained on a speech corpus of a language other than, but acoustically similar to, the target language. In our work the target language is Japanese, and English was chosen as the 'garbage language' because of its high acoustic similarity. English garbage models yield higher performance than Japanese garbage models. Moreover, parameter tuning (e.g., of the word insertion penalty) has no serious effect on performance when English garbage models are used. On clean telephone speech test data with a vocabulary of 100 keywords, we achieved an equal error rate of 7.9%, a very promising result. We also report results for several vocabulary sizes and investigate the selection of the most appropriate garbage model set.
In addition to the Japanese keyword spotting system, we report results for an English keyword spotter. Using Japanese garbage models instead of English ones gave a significant improvement: on telephone speech test data with a vocabulary of 25 keywords, the Figure of Merit (FOM) was 74.7%, compared with 68.9% when English garbage models were used.
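
The equal error rate quoted above is commonly understood as the operating point at which the false rejection rate (missed keywords) equals the false acceptance rate (false alarms). The abstract does not give the exact scoring protocol, so the following is only a minimal sketch under that common definition; the function name `equal_error_rate`, the score lists, and the threshold sweep are illustrative assumptions, not the authors' evaluation code.

```python
from typing import Sequence, Tuple


def equal_error_rate(
    keyword_scores: Sequence[float],
    non_keyword_scores: Sequence[float],
) -> Tuple[float, float]:
    """Return (approximate EER, threshold) for a score-based detector.

    Assumption: a putative hit is accepted when its score >= threshold.
    keyword_scores     -- scores of segments that really contain a keyword
    non_keyword_scores -- scores of segments that do not
    """
    # Candidate thresholds: every score that was actually observed.
    thresholds = sorted(set(keyword_scores) | set(non_keyword_scores))

    best_gap = None
    best_eer = 0.0
    best_threshold = thresholds[0]
    for t in thresholds:
        # False rejection: a true keyword scored below the threshold.
        frr = sum(s < t for s in keyword_scores) / len(keyword_scores)
        # False acceptance: a non-keyword scored at or above the threshold.
        far = sum(s >= t for s in non_keyword_scores) / len(non_keyword_scores)
        gap = abs(frr - far)
        # Keep the first threshold where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap = gap
            best_eer = (frr + far) / 2
            best_threshold = t
    return best_eer, best_threshold


# Hypothetical scores, made up purely for illustration.
example_keyword_scores = [0.9, 0.8, 0.7, 0.4]
example_non_keyword_scores = [0.6, 0.5, 0.3, 0.2]
example_eer, example_threshold = equal_error_rate(
    example_keyword_scores, example_non_keyword_scores
)
```

With these made-up scores the two error rates meet at a threshold of 0.6, where one of four keywords is rejected and one of four non-keywords is accepted, giving an EER of 25%. On real data the two rates rarely coincide exactly, which is why the sketch averages them at the closest point.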

Journal: Speech Communication
Publication Date: Jan 19, 2005
Citations: 3

