Abstract

To improve the sound quality of speech synthesis in intelligent broadcasting, a method based on a deep neural network (DNN) is proposed. The DNN is shown to be effective at discriminating silence/unvoiced/voiced (s/u/v) segments, and it is used to convert HMM-synthesized spectral parameters toward those of the original speech. A scheme is further proposed for transforming the parameters obtained from the temporal decomposition (TD) algorithm: the DNN is trained on the event vectors produced by TD, a transformation model is established, and the transformed event vectors are recombined with the untransformed event functions. Experiments show that converting 16-dimensional parameters scores poorly in subjective evaluation, because too few dimensions leave insufficient spectral detail and further distortion is introduced during synthesis. Converting 48-dimensional parameters is slightly better, mainly because more spectral detail is retained, although codebook mapping also makes the sound somewhat unstable. The results indicate that the intelligent voice broadcast system addresses these problems, reducing construction costs and improving service efficiency.
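
The recombination step described above can be illustrated with a minimal sketch. It assumes the usual linear TD model, in which each frame of spectral parameters is a weighted sum of event vectors and the weights are the event functions; the abstract does not state this form explicitly. The names `recombine` and `transform` are hypothetical, and `transform` is only a stand-in for the trained DNN, not the model used in the paper.

```python
def recombine(event_vectors, event_functions, transform):
    """Rebuild a spectral-parameter track from TD components.

    event_vectors:   list of K vectors, each a list of D floats
    event_functions: list of K lists, each holding T per-frame weights
    transform:       callable mapping one event vector to a converted one
                     (hypothetical stand-in for the trained DNN)
    Returns T frames, each a list of D floats.
    """
    # Only the event vectors are converted; the event functions are reused as-is.
    converted = [transform(vec) for vec in event_vectors]
    n_frames = len(event_functions[0])
    n_dims = len(converted[0])

    frames = []
    for t in range(n_frames):
        frame = [0.0] * n_dims
        for k, vec in enumerate(converted):
            weight = event_functions[k][t]
            for d in range(n_dims):
                frame[d] += weight * vec[d]
        frames.append(frame)
    return frames


# Toy data: 2 events, 3 spectral dimensions, 4 frames (illustrative values only).
toy_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
toy_functions = [[1.0, 0.5, 0.0, 0.0], [0.0, 0.5, 1.0, 1.0]]
toy_frames = recombine(toy_vectors, toy_functions, lambda v: [2.0 * x for x in v])
```

Because the event functions are left untouched, the timing of the utterance is preserved and only the spectral targets change, which is the design choice the scheme relies on.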
