Abstract

This paper proposes a new methodology for automatically comparing the speech rhythm structure of two utterances. Eleven parameters were automatically extracted from each member of 44 pairs of audio files, yielding one 11-dimensional difference vector per pair. The parameters include speech rate, duration-related stress group rate, prominence and prosodic boundary strength, f0 peak rate, and the coupling strength between underlying syllable and stress group oscillators. The 11-parameter difference vectors were used to infer the perceptual differences identified by a group of 10 listeners who judged the same 44 pairs of audio files. The results indicate that duration-related prominence or prosodic boundary rate and speech rate, taken together, predict up to 71% of the response variance. To a lesser extent, mean prominence/boundary strength and non-prominent VV unit rate predict up to 60% of the response variance when combined with prominence or prosodic boundary rate.

Keywords: speech rhythm, prominence, rhythm perception, speech rate
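
As an illustration of the kind of analysis summarized above, the following minimal sketch builds per-pair difference vectors and measures how much response variance a subset of parameters explains. It rests on several assumptions that the abstract does not confirm: that the difference is a simple subtraction, that the predictive model is an ordinary least-squares multiple regression, and that the listener response is a single score per pair. The function names `difference_vectors` and `explained_variance` are hypothetical, and all data in the example are random placeholders, not the study's measurements.

```python
import numpy as np


def difference_vectors(params_a, params_b):
    """Row-wise difference between two (n_pairs, n_params) parameter arrays.

    Assumption: a plain subtraction; the paper may define the difference otherwise.
    """
    a = np.asarray(params_a, dtype=float)
    b = np.asarray(params_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("both arrays must have the same shape")
    return a - b


def explained_variance(X, y):
    """R^2 of an ordinary least-squares fit of y on X (with an intercept).

    Assumption: a linear model stands in for whatever the paper actually used.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Placeholder data only: 44 pairs, 11 parameters per utterance.
rng = np.random.default_rng(0)
n_pairs, n_params = 44, 11
params_a = rng.normal(size=(n_pairs, n_params))
params_b = rng.normal(size=(n_pairs, n_params))
diffs = difference_vectors(params_a, params_b)

# Synthetic "listener responses" driven by the first two columns plus noise.
responses = 0.8 * diffs[:, 0] + 0.5 * diffs[:, 1] + rng.normal(scale=0.3, size=n_pairs)

# Share of response variance explained by a two-parameter subset.
r2 = explained_variance(diffs[:, :2], responses)
print(f"R^2 for the two-parameter model: {r2:.2f}")
```

Fitting different column subsets with `explained_variance` mirrors, in spirit, the comparison of parameter combinations reported in the abstract; the value printed here reflects only the synthetic data.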
