Abstract

This paper considers the generation of feedback utterances for the speaking-skills training of non-native English learners. The proposed feedback combines the learner's own voice with the linguistic gestures, i.e., the prosody or pronunciation, of a native speaker. Both an accent reduction method and a voice conversion method are employed to generate feedback stimuli. For accent reduction, three speech synthesis methods are used to reduce the accent in the utterances of English learners: pitch-synchronous overlap and add (PSOLA), the harmonic stochastic model (HSM), and speech transformation and representation by adaptive interpolation of weighted spectrogram (STRAIGHT). For voice conversion, the teacher's voice is converted to that of the learner, and the converted speech is used as feedback. Objective measurements are employed to assess the nativeness and acoustic quality of the generated stimuli. A feedback scheme that combines the accent reduction and voice conversion methods is also proposed.

