Abstract

This paper presents a new architecture for word spotting, one of the most important technologies in continuous speech recognition. A word spotting algorithm that can automatically create standard word and phrase models from phonemic segment models, which are smaller than single words, facilitates large-vocabulary spotting. Such an algorithm is also flexible in accommodating the acoustic variation of phonetic units within continuously spoken speech.

The word spotting algorithm proposed here has the following features: (1) Local spectrum sequences in speech are referred to as "acoustic events," and each phonemic segment is represented as a sequence of these events. Local spectrum variations are accommodated by probabilistic models, and variations in the temporal structure of phonemic segments are handled by a dynamic time warping (DTW) matching method applied to event sequences. (2) From the continuous speech used for training, the event sequences that make up each phonemic segment are learned automatically through an iterative process that adjusts the boundaries of the phonemic segments. (3) Phonetic strings written in phonetic orthography may be joined into phonemic segments appropriate to the phonemic environment, automatically creating any desired standard patterns for words or phrases. The performance of this algorithm is demonstrated in phoneme recognition, word spotting, and phrase spotting.
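As an illustration of feature (1), the following is a minimal sketch of dynamic time warping between a sequence of spectral frames and an ordered sequence of acoustic events. It is not the paper's implementation: the abstract does not specify the form of the probabilistic event models, so each event is assumed here to be a diagonal Gaussian given as a (mean, variance) pair, and the names `neg_log_likelihood` and `dtw_align` are hypothetical.

```python
import math


def neg_log_likelihood(frame, mean, var):
    """Cost of one frame under an event model (assumed: diagonal Gaussian)."""
    return 0.5 * sum(
        math.log(2.0 * math.pi * v) + (x - m) ** 2 / v
        for x, m, v in zip(frame, mean, var)
    )


def dtw_align(frames, events):
    """Align frames to an ordered event sequence.

    Each frame is assigned to exactly one event, events are visited in
    order, and every event receives at least one frame.
    Returns (total cost, list with the event index of each frame).
    """
    n, k = len(frames), len(events)
    if k == 0 or n < k:
        raise ValueError("need at least one frame per event")

    cost = [[math.inf] * k for _ in range(n)]
    back = [[0] * k for _ in range(n)]
    cost[0][0] = neg_log_likelihood(frames[0], *events[0])

    for t in range(1, n):
        for j in range(k):
            stay = cost[t - 1][j]                              # remain in event j
            advance = cost[t - 1][j - 1] if j > 0 else math.inf  # enter event j
            if advance < stay:
                best, prev = advance, j - 1
            else:
                best, prev = stay, j
            if best < math.inf:
                cost[t][j] = best + neg_log_likelihood(frames[t], *events[j])
                back[t][j] = prev

    # Trace the best path back from the last frame in the last event.
    path = [k - 1]
    for t in range(n - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return cost[n - 1][k - 1], path
```

The event index assigned to each frame also marks where one event ends and the next begins. Under the same assumptions, boundaries of this kind are what an iterative re-estimation such as the one in feature (2) would adjust between passes.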
