Abstract

A computational algorithm is presented which locates syllabic boundaries in connected speech. Digitized speech is processed in two stages. The first stage is an auditory front end that filters the digitized speech into critical bands. The critical bands are amplitude-scaled according to a standard frequency/amplitude function, and low-frequency amplitude modulations in the 2–30-Hz range are then emphasized according to a perceptual modulation sensitivity function. Next, the critical bands are low-pass filtered at 100 Hz, decimated at a rate of 25:1, and low-pass filtered again, creating acoustic envelopes of the processed speech, one for each critical band. During the second stage of processing, an autocorrelation-based algorithm is run on the envelopes. The local minima of this algorithm's output are pooled across the critical-band envelopes, yielding syllabic boundaries. The utterances used to develop and test the algorithm were taken from the Harvard Phonetically Balanced Sentences. Currently, the algorithm places boundaries within a few tens of milliseconds of where a phonemic syllabification would place them. Work continues on fine-tuning the algorithm; a further goal is to compare its performance against that of human listeners. [The author wishes to acknowledge the support of Lucent Technologies in conducting this project.]
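
The abstract does not give implementation details, but the processing chain it describes can be sketched in Python with NumPy and SciPy. In the sketch below, the Zwicker critical-band edges, the flat amplitude and modulation weightings, the single-lag windowed autocorrelation statistic, and the majority-vote pooling of per-band minima are all illustrative assumptions standing in for the unspecified frequency/amplitude function, the perceptual modulation sensitivity function, and the author's actual autocorrelation algorithm:

import numpy as np
from scipy.signal import butter, sosfiltfilt, decimate, argrelmin

FS = 16000          # assumed input sampling rate (Hz)
ENV_FS = FS // 25   # 640-Hz envelope rate after the 25:1 decimation

# Zwicker critical-band edges (Hz) up to 6.4 kHz
BARK_EDGES = [100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270,
              1480, 1720, 2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400]

def band_sos(lo, hi, fs, order=4):
    return butter(order, [lo, hi], btype="bandpass", fs=fs, output="sos")

def critical_band_envelopes(x, fs=FS):
    # Stage 1: critical-band filtering, 2-30-Hz modulation emphasis,
    # 100-Hz low-pass, 25:1 decimation, 100-Hz low-pass again.
    envs = []
    for lo, hi in zip(BARK_EDGES[:-1], BARK_EDGES[1:]):
        band = sosfiltfilt(band_sos(lo, hi, fs), x)       # one critical band
        env = np.abs(band)                                # rectified amplitude envelope
        # Crude modulation emphasis: boost the 2-30-Hz component of the
        # envelope (flat weighting stands in for the sensitivity function).
        env = env + sosfiltfilt(band_sos(2.0, 30.0, fs), env)
        env = sosfiltfilt(butter(4, 100.0, fs=fs, output="sos"), env)
        env = decimate(decimate(env, 5), 5)               # 25:1, in two stages
        env = sosfiltfilt(butter(4, 100.0, fs=fs // 25, output="sos"), env)
        envs.append(env)
    return np.vstack(envs)

def short_time_autocorr(env, win=64, lag=8):
    # Normalized autocorrelation at a single lag (~12.5 ms) over a sliding
    # ~100-ms window; the published statistic is not specified in the abstract.
    out = np.empty(len(env) - win - lag)
    for i in range(len(out)):
        a, b = env[i:i + win], env[i + lag:i + lag + win]
        out[i] = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    return out

def syllable_boundaries(envs, env_fs=ENV_FS, min_bands=None):
    # Stage 2: find per-band autocorrelation minima and pool them across bands.
    n_bands, n = envs.shape
    if min_bands is None:
        min_bands = n_bands // 2                          # majority vote (assumption)
    votes = np.zeros(n, dtype=int)
    for env in envs:
        for m in argrelmin(short_time_autocorr(env), order=6)[0]:
            votes[max(0, m - 3):m + 4] += 1               # tolerate band misalignment
    hits = np.flatnonzero(votes >= min_bands)
    bounds, group = [], []
    for h in hits:                                        # merge adjacent frames
        if group and h - group[-1] > 1:
            bounds.append(np.mean(group))
            group = []
        group.append(h)
    if group:
        bounds.append(np.mean(group))
    return np.array(bounds) / env_fs                      # boundary times (s)

At the 640-Hz envelope rate each frame is about 1.6 ms, so smearing votes over plus or minus three frames tolerates roughly 5 ms of misalignment between bands; the window sizes and the pooling threshold are placeholders that would need tuning against hand-syllabified reference data.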
