Meta-Learning for Mandarin-Tibetan Cross-Lingual Speech Synthesis

Weizhao Zhang, Hongwu Yang

doi:10.3390/app122312185

Abstract

This paper proposes a meta-learning-based Mandarin-Tibetan cross-lingual text-to-speech (TTS) system that synthesizes both Mandarin and Tibetan speech within a single framework. First, we build two Tacotron2-based Mandarin-Tibetan cross-lingual baseline TTS systems: one with a shared encoder and the other with separate encoders. Both baselines use a speaker classifier with a gradient reversal layer to disentangle speaker-specific information from the text encoder. In addition, we design a prosody generator that extracts prosodic information from sentences so that syntactic and semantic information is adequately exploited. To further improve the quality of the speech synthesized by the Tacotron2-based baselines, we propose a meta-learning-based Mandarin-Tibetan cross-lingual TTS. Building on the separate-encoder baseline, it uses an additional dynamic network to predict the parameters of the language-dependent text encoder, which enables better cross-lingual knowledge sharing in the sequence-to-sequence TTS. Finally, Mandarin or Tibetan speech is synthesized through the single acoustic model. The baseline experiments show that the separate-encoder TTS handles input in different languages better than the shared-encoder TTS. Further experiments show that the proposed meta-learning-based method effectively improves the quality of the synthesized speech in terms of naturalness and speaker similarity.
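
The idea of a dynamic network that predicts the parameters of a language-dependent text encoder can be illustrated with a small, hedged sketch. It is not the authors' implementation: the abstract does not say what the dynamic network takes as input or how large it is, so the language embedding, the single linear encoder layer, and every name below (ParameterGenerator, encoder_weights, encode) are assumptions chosen only for illustration. The sketch uses plain Python lists instead of a deep-learning library and does no training.

```python
import random


def make_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random values."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ParameterGenerator:
    """Hypothetical dynamic network (an assumption, not the paper's design).

    It maps a per-language embedding to the weight matrix of one linear
    encoder layer. The projection is shared by all languages, so knowledge
    is shared, while the generated weights stay language-dependent.
    """

    def __init__(self, languages, embed_dim, in_dim, out_dim, seed=0):
        rng = random.Random(seed)
        self.in_dim = in_dim
        self.out_dim = out_dim
        # One embedding vector per language (assumed input of the generator).
        self.embeddings = {
            lang: [rng.uniform(-1.0, 1.0) for _ in range(embed_dim)]
            for lang in languages
        }
        # Shared projection: embedding -> flattened (out_dim * in_dim) weights.
        self.projection = make_matrix(out_dim * in_dim, embed_dim, rng)

    def encoder_weights(self, language):
        """Generate the out_dim x in_dim encoder weights for one language."""
        flat = matvec(self.projection, self.embeddings[language])
        return [
            flat[r * self.in_dim:(r + 1) * self.in_dim]
            for r in range(self.out_dim)
        ]


def encode(generator, language, token_vectors):
    """Apply the generated language-specific layer to each input vector."""
    weights = generator.encoder_weights(language)
    return [matvec(weights, vec) for vec in token_vectors]


# Usage: the same inputs pass through encoders generated for each language.
generator = ParameterGenerator(["mandarin", "tibetan"], embed_dim=4, in_dim=3, out_dim=2)
tokens = [[1.0, 0.0, 0.5], [0.2, 0.3, 0.1]]
mandarin_out = encode(generator, "mandarin", tokens)
tibetan_out = encode(generator, "tibetan", tokens)
```

In this sketch the only language-specific state is the embedding vector; everything else is shared, which is one plausible reading of how parameter prediction could support cross-lingual knowledge sharing.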

Journal: Applied Sciences
Publication Date: Nov 28, 2022
Citations: 1
License type: CC BY 4.0

Similar Papers

Deep Learning for Mandarin-Tibetan Cross-Lingual Speech Synthesis
Weizhao Zhang ... Lili Wang
IEEE Access | VOL. 7
01 Jan 2019

A DNN-based Mandarin-Tibetan cross-lingual speech synthesis
Weitong Guo ... Zhenye Gan
01 Nov 2018

Research on Tibetan Speech Synthesis Based on Fastspeech2
Ba Zu ... Zhijie Cai
22 Jul 2022

Improving Sequence-to-sequence Tibetan Speech Synthesis with Prosodic Information
Weizhao Zhang ... Hongwu Yang
ACM Transactions on Asian and Low-Resource Language Information Processing | VOL. 22
22 Sep 2023
