Abstract

Emotional speech generation is a challenging and widely studied topic in speech processing. Because the design of effective speech feature representations and generation models directly affects the accuracy of emotional speech generation, a general solution for emotional speech synthesis is difficult to find. In this paper, we take the CycleGAN model as a starting point and use an improved convolutional neural network (CNN) model together with an identity mapping loss scheme to capture temporal information effectively. At the same time, we learn both the forward mapping and the inverse mapping to find the best-matching design, preserving the linguistic content of the speech in the process without relying on additional audio data. Experiments on a corpus of children's read speech show that the generated emotional speech can be accurately recognized, based on a comparison of speech emotion before and after the improvement. Comparisons with common emotional speech generation models verify the advantages of the proposed model.
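To illustrate the kind of objective the abstract describes, the following is a minimal sketch of a CycleGAN-style training loss combining adversarial, cycle-consistency, and identity-mapping terms over two mapping directions. It assumes a PyTorch setup; the names (G_xy, G_yx, D_x, D_y, lambda_cyc, lambda_id) and the tiny 1-D CNN blocks are illustrative assumptions, not the paper's actual architecture or hyperparameters.

```python
import torch
import torch.nn as nn

mse = nn.MSELoss()  # least-squares adversarial loss
l1 = nn.L1Loss()    # cycle-consistency and identity losses

def generator_loss(G_xy, G_yx, D_x, D_y, x, y, lambda_cyc=10.0, lambda_id=5.0):
    """Combined generator objective for one step (sketch, hypothetical names)."""
    fake_y = G_xy(x)  # neutral -> emotional speech features
    fake_x = G_yx(y)  # emotional -> neutral speech features

    # Adversarial terms: each generator tries to fool the opposite discriminator.
    adv = mse(D_y(fake_y), torch.ones_like(D_y(fake_y))) + \
          mse(D_x(fake_x), torch.ones_like(D_x(fake_x)))

    # Cycle consistency: mapping forward and then back should recover the input.
    cyc = l1(G_yx(fake_y), x) + l1(G_xy(fake_x), y)

    # Identity mapping: a target-domain input passed through its own generator
    # should change as little as possible, helping preserve linguistic content.
    idt = l1(G_xy(y), y) + l1(G_yx(x), x)

    return adv + lambda_cyc * cyc + lambda_id * idt

if __name__ == "__main__":
    # Toy 1-D CNN blocks over (batch, feature_dim, frames) tensors, e.g. 24 MCEPs.
    def tiny_cnn(out_ch):
        return nn.Sequential(nn.Conv1d(24, 64, 5, padding=2), nn.ReLU(),
                             nn.Conv1d(64, out_ch, 5, padding=2))
    G_xy, G_yx = tiny_cnn(24), tiny_cnn(24)
    D_x, D_y = tiny_cnn(1), tiny_cnn(1)
    x, y = torch.randn(2, 24, 128), torch.randn(2, 24, 128)
    print(generator_loss(G_xy, G_yx, D_x, D_y, x, y).item())
```

In a full training loop, the two discriminators would also be updated with their own least-squares losses on real versus generated features; the sketch above only shows the generator-side objective that the identity mapping term belongs to.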
