Abstract

Phoneme sequence constraints, such as /b l g/, that occur across word boundaries (considerable growth) but not word internally were used for word‐boundary identification from continuous speech transcriptions. All three‐phoneme, word‐boundary sequences were derived by pairing P♯ (all word‐final phonemes) with ♯PP (all word‐initial, two‐phoneme sequences) and PP♯ with ♯P in a 23 000‐word lexicon including 70 000 reduced phonological forms. The resulting P♯PP and PP♯P sequences were matched against the same lexicon to determine which of them were excluded word internally. All boundary sequences excluded word internally were compiled into a tree structure that was matched against three sets of phonemic transcriptions corresponding to “fast,” “normal,” and “citation” speech production styles of 145 utterances. The results showed a word‐boundary detection rate of between 39% and 50% in the three transcription sets. The incorporation of phoneme sequence constraints also reduces the number of word parsings compared with an earlier left‐to‐right chart‐parsing approach for word‐boundary identification [J. M. Harrington and A. Johnstone, Comput. Speech Lang. (1988), in press].
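The derivation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the four-word toy lexicon, and the letter-like phoneme symbols are all hypothetical, and a plain dictionary lookup is used here in place of the tree structure mentioned in the abstract. The sketch pairs P♯ with ♯PP and PP♯ with ♯P, keeps only the three-phoneme sequences that never occur word internally, and then scans a phoneme string for them.

```python
from itertools import product


def boundary_only_sequences(lexicon):
    """Map each three-phoneme sequence that can span a word boundary,
    but never occurs inside a lexicon word, to the offset(s) at which
    the boundary falls (1 for P#PP, 2 for PP#P)."""
    words = [tuple(w) for w in lexicon]

    # Every three-phoneme sequence found inside some word.
    internal = {w[i:i + 3] for w in words for i in range(len(w) - 2)}

    finals1 = {w[-1:] for w in words}                     # P#
    initials2 = {w[:2] for w in words if len(w) >= 2}     # #PP
    finals2 = {w[-2:] for w in words if len(w) >= 2}      # PP#
    initials1 = {w[:1] for w in words}                    # #P

    excluded = {}
    for offset, lefts, rights in ((1, finals1, initials2),
                                  (2, finals2, initials1)):
        for left, right in product(lefts, rights):
            seq = left + right
            if seq not in internal:
                excluded.setdefault(seq, set()).add(offset)
    return excluded


def detect_boundaries(phonemes, excluded):
    """Return sorted indices i such that a word boundary must fall
    immediately before phonemes[i]."""
    found = set()
    for i in range(len(phonemes) - 2):
        trigram = tuple(phonemes[i:i + 3])
        for offset in excluded.get(trigram, ()):
            found.add(i + offset)
    return sorted(found)


# Hypothetical toy lexicon; symbols stand in for phonemes.
toy_lexicon = [
    ("k", "a", "t"),
    ("d", "o", "g"),
    ("g", "o"),
    ("a", "t"),
]
excluded = boundary_only_sequences(toy_lexicon)

# Unsegmented transcription of two toy words in sequence.
utterance = ["k", "a", "t", "d", "o", "g"]
boundaries = detect_boundaries(utterance, excluded)
```

In this toy case both /a t d/ (a PP♯P sequence) and /t d o/ (a P♯PP sequence) are absent word internally, so each independently places a boundary before the fourth phoneme. With a realistic lexicon many cross-boundary sequences also occur inside words, which is why only a portion of the boundaries can be detected this way.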
