Abstract

ESOPE0, the first version of our speech recognition system, uses a top‐down strategy from the pragmatic level to the phonetic level and operates from left to right with a best‐few method and no backtracking. A dynamic comparison among the four best phoneme candidates is carried out. ESOPE1 applies the same basic strategy systematically: the best‐few algorithm becomes a beam‐search procedure. ESOPE1‐1 extends the top‐down treatment down to the acoustic level by means of a diphone dictionary, and uses a dynamic comparison method at that level. In our automatic dictation project, which uses a natural‐language syntax and a 170 000‐form vocabulary, a bottom‐up, best‐few approach was adopted to translate an error‐free continuous phoneme string into words. We therefore feel that a severely limited language combined with poor phoneme recognition calls for a top‐down strategy, whereas a bottom‐up strategy is preferable in the opposite situation. This, together with recent results in psycholinguistics, leads us, in our current development of ESOPE2, to combine a top‐down and a bottom‐up strategy (Prediction‐Verification‐Induction). Predictions are made at each level, but the recognized phonemes may introduce unpredicted words, which gives the system a limited learning ability.
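
To make the left‐to‐right, best‐few, no‐backtracking idea concrete, the following is a minimal sketch of a generic beam search. It is not the ESOPE implementation: the function names, the toy grammar, the toy scores and the beam width are all assumptions introduced for illustration. The higher level (here a hypothetical grammar) predicts the admissible next words top‐down, a hypothetical lower‐level score rates each prediction, and only the few best partial hypotheses survive each step.

```python
from typing import Callable, List, Sequence, Tuple

# A hypothesis is (cumulative score, words recognized so far).
Hypothesis = Tuple[float, Tuple[str, ...]]


def beam_search(
    n_steps: int,
    successors: Callable[[Tuple[str, ...]], Sequence[str]],
    score: Callable[[int, str], float],
    beam_width: int = 4,
) -> List[Hypothesis]:
    """Left-to-right best-few search without backtracking.

    successors: top-down prediction of the words allowed after a prefix
                (hypothetical stand-in for the syntactic/pragmatic levels).
    score:      how well a word matches the input at a given step
                (hypothetical stand-in for the phonetic/acoustic levels);
                higher is better.
    """
    beam: List[Hypothesis] = [(0.0, ())]
    for step in range(n_steps):
        candidates: List[Hypothesis] = []
        for total, words in beam:
            for word in successors(words):
                candidates.append((total + score(step, word), words + (word,)))
        if not candidates:
            break  # no prediction can be extended; keep the current beam
        # Keep only the best few; discarded hypotheses are never revisited.
        candidates.sort(key=lambda h: h[0], reverse=True)
        beam = candidates[:beam_width]
    return beam


# Toy data, invented for this example only.
TOY_GRAMMAR = {
    (): ["call", "show"],
    ("call",): ["home", "office"],
    ("show",): ["map"],
}
TOY_SCORES = {
    (0, "call"): -0.2,
    (0, "show"): -1.5,
    (1, "home"): -0.3,
    (1, "office"): -0.9,
    (1, "map"): -0.1,
}


def toy_successors(prefix: Tuple[str, ...]) -> Sequence[str]:
    return TOY_GRAMMAR.get(prefix, [])


def toy_score(step: int, word: str) -> float:
    return TOY_SCORES.get((step, word), -10.0)


if __name__ == "__main__":
    for total, words in beam_search(2, toy_successors, toy_score, beam_width=2):
        print(" ".join(words), round(total, 3))
```

With a beam width of 1 the procedure reduces to a purely greedy search; a wider beam trades computation for robustness against early scoring errors, which matters because nothing pruned can be recovered later.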
