Using Phonetic Posteriorgram Based Frame Pairing for Segmental Accent Conversion

Guanlong Zhao, Ricardo Gutierrez-Osuna

doi:10.1109/taslp.2019.2926754

Open Access

https://doi.org/10.1109/taslp.2019.2926754

Abstract

Accent conversion (AC) aims to transform non-native utterances so that they sound as if the speaker had a native accent. This can be achieved by mapping source speech spectra from a native speaker into the acoustic space of the target non-native speaker. In prior work, we proposed an AC approach that matches frames between the two speakers based on their acoustic similarity after compensating for differences in vocal tract length. In this paper, we propose a new approach that matches frames between the two speakers based on their phonetic rather than acoustic similarity. Namely, we map frames from the two speakers into a phonetic posteriorgram using speaker-independent acoustic models trained on native speech. We thoroughly evaluate the approach on a speech corpus containing multiple native and non-native speakers. The proposed algorithm outperforms the prior approach, improving ratings of acoustic quality (22% increase in mean opinion score) and native accent (69% preference) while retaining the voice quality of the non-native speaker. Furthermore, we show that the approach can be used in the reverse conversion direction, i.e., generating speech with a native speaker's voice quality and a non-native accent. Finally, we show that this approach can be applied to non-parallel training data, achieving the same accent conversion performance.
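
The frame-pairing idea in the abstract can be illustrated with a minimal sketch. It assumes each speaker's frames have already been converted into phonetic posteriorgrams (one probability vector over phonetic classes per frame) and pairs each source frame with the target frame whose posteriorgram is closest. The symmetric Kullback-Leibler divergence used as the distance, and the function names `symmetric_kl` and `pair_frames`, are assumptions made for illustration; the abstract does not state which similarity measure the authors use.

```python
import numpy as np


def symmetric_kl(p, q, eps=1e-10):
    """Pairwise symmetric KL divergence between rows of p (N x D) and q (M x D).

    Each row is assumed to be a posterior distribution over D phonetic classes.
    Returns an N x M matrix. The choice of distance is an illustrative assumption.
    """
    p = np.clip(np.asarray(p, dtype=float), eps, None)
    q = np.clip(np.asarray(q, dtype=float), eps, None)
    log_p, log_q = np.log(p), np.log(q)
    # KL(p||q) + KL(q||p) = sum p log p - sum p log q - sum q log p + sum q log q
    pp = np.sum(p * log_p, axis=1)[:, None]
    qq = np.sum(q * log_q, axis=1)[None, :]
    return pp - p @ log_q.T - log_p @ q.T + qq


def pair_frames(src_ppg, tgt_ppg):
    """For each source frame, return the index of the phonetically closest target frame."""
    return np.argmin(symmetric_kl(src_ppg, tgt_ppg), axis=1)


# Toy posteriorgrams over 3 hypothetical phonetic classes.
src = np.array([[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]])
tgt = np.array([[0.1, 0.2, 0.7], [0.7, 0.2, 0.1], [0.2, 0.7, 0.1]])
pairs = pair_frames(src, tgt)
```

Because the pairing relies only on phonetic content and not on time alignment, a scheme of this kind does not require the two speakers to have recorded the same sentences, which is consistent with the abstract's claim about non-parallel training data.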

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Publication Date: Oct 1, 2019
Citations: 14
License type: publisher-specific-oa
