Abstract

This paper introduces a method for automatically translating Vietnamese text into Muong speech in two dialects, Muong Bi - Hoa Binh and Muong Tan Son - Phu Tho, both of which are unwritten dialects of the Muong language. Because Vietnamese and Muong are very closely related, the translation system was built as a cross-lingual speech synthesis system, in which the input is text in one language (Vietnamese) and the output is speech in another (the two Muong dialects). The system uses the modern sequence-to-sequence neural TTS models Tacotron2 and WaveGlow. The evaluation showed high translation quality (a fluency score of 4.61/5.0 and an adequacy score of 4.79/5.0) and high synthesized speech quality (naturalness of 4.68/5.0 on the MOS scale and intelligibility of 94.60%). These results suggest that the proposed system is promising for other minority languages, especially unwritten ones.
