Analysis of Chinese singing voices and its application to singing voice synthesis

Kenko Ota, Terumasa Ehara

doi:10.1121/1.4708736

Abstract

Many researchers currently work on singing voice synthesis in languages such as Japanese and English, but few studies address singing voice synthesis in Chinese. This research therefore develops a fundamental frequency (F0) control method for a natural vocal conversion system that converts a Chinese speaking voice into a singing voice. First, Chinese singing voices are analyzed to clarify the characteristics of their F0 contours. The analysis shows that the F0 of Chinese singing voices varies not only with the acoustic characteristics that affect singing voice perception, such as overshoot, but also with the four tones. A vocal conversion system is then developed based on these findings. To confirm the validity of the developed F0 control model, native Chinese evaluators subjectively evaluate three kinds of synthesized singing voices: the first is synthesized by controlling the F0 contour according to the musical notes, the second is synthesized by considering the acoustic characteristics that affect singing voice perception, and the third is synthesized by the proposed F0 control method. The results show that the singing voices synthesized by the proposed method achieve high naturalness.
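
The abstract does not specify the F0 control model, so the following is only a minimal sketch of the first two synthesis conditions it mentions: a stepwise F0 contour that follows the musical notes, and the same contour with overshoot at note transitions. Modeling overshoot with an under-damped second-order system is an assumption here, not necessarily the authors' method, and the function names and the parameters `omega`, `zeta`, and `frame_s` are hypothetical. The tone-dependent component of the proposed method is omitted because the abstract gives no detail on it.

```python
import math


def midi_to_hz(note):
    """Convert a MIDI note number to Hz (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def step_contour(notes, frame_s=0.005):
    """Build a stepwise log-F0 target from (midi_note, duration_s) pairs.

    This corresponds to controlling F0 according to the musical notes only.
    """
    contour = []
    for midi, dur in notes:
        n_frames = int(round(dur / frame_s))
        contour.extend([math.log(midi_to_hz(midi))] * n_frames)
    return contour


def add_overshoot(target, frame_s=0.005, omega=40.0, zeta=0.5):
    """Filter the target with an under-damped second-order system.

    Assumed model (not taken from the paper):
        y'' + 2*zeta*omega*y' + omega**2 * y = omega**2 * x
    integrated with semi-implicit Euler. With zeta < 1 the output
    exceeds the new target briefly after each note change.
    """
    y = target[0]  # start at rest on the first note
    v = 0.0
    out = []
    for x in target:
        a = omega ** 2 * (x - y) - 2.0 * zeta * omega * v
        v += a * frame_s
        y += v * frame_s
        out.append(y)
    return out


# Two notes: C4 then E4, half a second each.
target = step_contour([(60, 0.5), (64, 0.5)])
smoothed = add_overshoot(target)
f0_hz = [math.exp(v) for v in smoothed]
```

Working in the log-frequency domain keeps the overshoot proportional to the musical interval rather than to the absolute frequency difference.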
