Abstract

Speech synthesis, also known as text-to-speech (TTS), is a technology that converts text into sound, and it is an important means of communication between humans and machines. Because English is an international language, English speech synthesis is worth studying. This paper describes an end-to-end English speech synthesis method and system based on convolutional neural networks, using WaveRNN and WaveNet. Experiments show that the synthesized speech achieves a Mean Opinion Score (MOS) of 3.32, and that its quality is better than that of a typical parametric speech synthesis system.
