Abstract

A computer program to segment speech into syllables is now a part of the Lincoln Laboratory speech-recognition system. It is intended to facilitate processing of polysyllabic words spoken in isolation and, eventually, of connected speech. The program does not attempt to fix boundaries precisely in time. Results are evaluated with reference to the aural impression that a sequence of speech sounds makes on the investigator. The aim is to ensure agreement between program and investigator for those segments at which the latter is certain that there is or is not a boundary, and to confine disagreements to those segments at which the investigator is uncertain. Decisions are made at three levels, the easiest being made at the first level. Syllable boundaries are first marked at transitions from voiced to voiceless segments, voicing being determined according to the ratio of amplitude in a lowpass band to the total speech amplitude. Then the stretches of continuously voiced speech thus marked are processed: dips in the overall amplitude large enough to indicate a syllable boundary unambiguously are marked as boundaries. At the third level, the voiced segments defined by the boundaries thus far determined are further processed. More detailed characteristics of the spectrum are used to make decisions in the most difficult cases: vowel-semivowel-vowel combinations and the monosyllabic diphthongs.
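
The first two decision levels can be illustrated with a minimal sketch. This is not the Lincoln Laboratory program: the per-frame amplitude inputs, the function names, and both thresholds (`ratio_threshold`, `dip_ratio`) are assumptions chosen for illustration, and the third level is omitted because the abstract does not specify which spectral characteristics it uses.

```python
from typing import List, Tuple


def voiced_flags(lowpass_amp: List[float], total_amp: List[float],
                 ratio_threshold: float = 0.5) -> List[bool]:
    """Call a frame voiced when lowpass/total amplitude meets the threshold.

    The threshold value is an assumption, not a figure from the paper.
    """
    return [total > 0 and low / total >= ratio_threshold
            for low, total in zip(lowpass_amp, total_amp)]


def level1_boundaries(flags: List[bool]) -> List[int]:
    """Level 1: mark a boundary at each voiced-to-voiceless transition."""
    return [i for i in range(1, len(flags)) if flags[i - 1] and not flags[i]]


def voiced_stretches(flags: List[bool]) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges of continuous voicing, end exclusive."""
    stretches = []
    start = None
    for i, voiced in enumerate(flags):
        if voiced and start is None:
            start = i
        elif not voiced and start is not None:
            stretches.append((start, i))
            start = None
    if start is not None:
        stretches.append((start, len(flags)))
    return stretches


def level2_boundaries(total_amp: List[float],
                      stretches: List[Tuple[int, int]],
                      dip_ratio: float = 0.5) -> List[int]:
    """Level 2: mark deep dips in overall amplitude inside voiced stretches.

    A frame counts as a dip when it is a local minimum and falls below
    dip_ratio times the smaller of the highest amplitudes on either side
    of it within the same stretch. The criterion is an assumption.
    """
    boundaries = []
    for start, end in stretches:
        for i in range(start + 1, end - 1):
            left_peak = max(total_amp[start:i])
            right_peak = max(total_amp[i + 1:end])
            is_local_min = (total_amp[i] <= total_amp[i - 1]
                            and total_amp[i] < total_amp[i + 1])
            if is_local_min and total_amp[i] < dip_ratio * min(left_peak, right_peak):
                boundaries.append(i)
    return boundaries


def segment_syllables(lowpass_amp: List[float],
                      total_amp: List[float]) -> List[int]:
    """Combine the level 1 and level 2 boundaries into one sorted list."""
    flags = voiced_flags(lowpass_amp, total_amp)
    found = set(level1_boundaries(flags))
    found.update(level2_boundaries(total_amp, voiced_stretches(flags)))
    return sorted(found)
```

Boundaries are returned as frame indices rather than times, in keeping with the statement that the program does not attempt to fix boundaries precisely in time.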

