Abstract

A connectionist structure for phoneme recognition in continuous speech is described. It has two main parts. The first is a sound subunit classifier, in the form of a three-layer back-propagation network, which classifies speech subunits from frames of spectral speech data. It is followed by a sequence classifier, in the form of a network of neural-like units, which classifies phonemes from input sequences of subunits according to their occurrence and duration. Results are given for a 15-phoneme subset of British English, for a single speaker. The subset includes the difficult syllable-initial and syllable-final stop consonants, fricatives, vowels, and diphthongs. The overall recognition accuracy achieved is 87%.
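
To make the "occurrence and duration" representation concrete, the following is a minimal Python sketch of how per-frame subunit labels could be collapsed into (subunit, duration) runs before sequence classification. It is an illustration only, not the paper's method: the paper uses a network of neural-like units for this stage, and the function names, the subunit labels ("sil", "b1", "a"), and the `min_frames` threshold are all assumptions introduced here.

```python
from itertools import groupby


def runs(frame_labels):
    """Collapse per-frame subunit labels into (label, duration_in_frames) runs."""
    return [(label, sum(1 for _ in group)) for label, group in groupby(frame_labels)]


def drop_short_runs(label_runs, min_frames=2):
    """Discard runs shorter than min_frames (assumed threshold), then merge
    neighbouring runs that end up with the same label."""
    merged = []
    for label, duration in label_runs:
        if duration < min_frames:
            continue  # treat very short runs as classifier noise
        if merged and merged[-1][0] == label:
            merged[-1] = (label, merged[-1][1] + duration)
        else:
            merged.append((label, duration))
    return merged


# Hypothetical frame-level output of the subunit classifier (labels are made up).
frames = ["sil", "sil", "b1", "b1", "b1", "a", "b1", "a", "a", "a", "a"]
subunit_sequence = drop_short_runs(runs(frames))
```

A sequence classifier would then map such a subunit sequence to a phoneme; how the paper's network weighs occurrence against duration is not stated in the abstract.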

