Abstract

For learning foreign language pronunciation, prosodic features are as important as, or more important than, segmental features. For Japanese speakers, one of the difficulties in learning English pronunciation is rhythm, because the two languages differ: Japanese has mora-timed rhythm and English has stress-timed rhythm. In order to correct errors in rhythm, a method of evaluating sentence stress, which constitutes rhythm, is essential. In this paper, we present a method of automatically detecting sentence-stressed syllables to serve as the evaluation criterion. Using a linear discriminant function of pitch, intensity and vowel duration, about 90% of the syllables were correctly classified with respect to sentence stress. We also analyzed the common and differing characteristics among different native English speakers. The results revealed that the perception of sentence stress among 11 native speakers was in general agreement with respect to how the three features are integrated.

1. INTRODUCTION

Prosodic features play an important role in human communication: they make the focal point of a topic clear, and they emphasize or express the speaker's intention. However, the differences between English and Japanese in the expression of stress and rhythm cause serious difficulties for learners trying to master prosodic patterns. Japanese speakers tend to pronounce English in a monotonous rhythm, which occasionally leads to miscommunication. Effective evaluation and instruction of rhythm are therefore essential for acquiring correct English pronunciation.

Recent advances in speech technology enable us to develop a computer-assisted language learning (CALL) system. In previous studies [1][2][3], effective evaluation criteria for segmental features or intonation were proposed. Also, a method of evaluating and teaching English word accent for Japanese learners was proposed [4]. However, an evaluation criterion for sentence stress, which is one of the most essential factors of English rhythm, has not been established.
So we propose a method of automatically detecting sentence-stressed syllables as a basis for an effective CALL system.

It is known that English syllable stress is conveyed by pitch, intensity and duration. These correspond to the acoustic features of fundamental frequency, power and vowel duration, respectively. In [6], rhythm instruction was realized by judging stress from duration and vowel quality, not from pitch or intensity. According to [8], pitch and power are also key factors of sentence stress. So we used all three features to detect sentence stress. However, it is not clear which acoustic feature is the most important, or how these features are integrated, when native English speakers perceive sentence stress. We investigate these issues and try to establish a universal evaluation criterion for English sentence stress.

In order to estimate the occurrence of sentence-stressed syllables from the contours of fundamental frequency and power, each acoustic feature is normalized and quantized per syllable. We then adopt a linear discriminant function that integrates these acoustic features with weights. The weights, which reflect the importance of each acoustic feature in perceiving English sentence stress, are estimated by discriminant analysis using the TIMIT database.

We verify the validity of our model from the degree of agreement between the native speakers' perception and the output of our linear discriminant function. Also, by comparing the weights estimated from different native speakers' labels, we examine the common and differing characteristics of their perception.

2. SPEECH MATERIAL

As materials for the study, we selected 310 sentences produced by native English speakers of the New England dialect from the TIMIT database. These speech samples were labeled by eleven native English speakers. Based on their judgments, we eliminated inadequate speech samples that had a neutral declarative rhythm, in order to improve the reliability of the speech materials and their labels.
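
The introduction describes normalizing each acoustic feature per syllable and integrating the features with weights in a linear discriminant function. The following is a minimal sketch of that idea, not the authors' implementation. It assumes one value per syllable for each of the three features, uses z-score normalization within an utterance, and fits a two-class Fisher discriminant. The function names and the synthetic data are hypothetical and are used only for illustration; no TIMIT data is involved.

```python
import numpy as np

def zscore_per_utterance(features):
    """Normalize each column (pitch, power, vowel duration) within one utterance.

    Assumption: `features` has one row per syllable and one column per feature.
    """
    features = np.asarray(features, dtype=float)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    return (features - features.mean(axis=0)) / std

def fit_fisher_lda(X, y):
    """Fit a two-class Fisher linear discriminant; return weights w and bias b."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=bool)
    stressed, unstressed = X[y], X[~y]
    mu1, mu0 = stressed.mean(axis=0), unstressed.mean(axis=0)
    # Pooled within-class scatter matrix (covariance times degrees of freedom).
    Sw = (np.cov(stressed, rowvar=False) * (len(stressed) - 1)
          + np.cov(unstressed, rowvar=False) * (len(unstressed) - 1))
    w = np.linalg.solve(Sw, mu1 - mu0)
    # Place the decision threshold midway between the projected class means.
    b = -w @ (mu1 + mu0) / 2
    return w, b

def predict_stress(X, w, b):
    """Return True for syllables classified as carrying sentence stress."""
    return np.asarray(X, dtype=float) @ w + b > 0

# Synthetic, already-normalized syllable features (hypothetical data):
# stressed syllables are shifted upward in all three features.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(1.0, 1.0, (200, 3)), rng.normal(-1.0, 1.0, (200, 3))])
y = np.array([True] * 200 + [False] * 200)

w, b = fit_fisher_lda(X, y)
accuracy = (predict_stress(X, w, b) == y).mean()
print("weights (pitch, power, duration):", w)
print("training accuracy:", accuracy)
```

When the features are on a common normalized scale, the relative magnitudes of the entries of `w` give a rough indication of how much each feature contributes to the decision, which is the sense in which the weights are interpreted above.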
