Rhythm Speech Lyrics Input for MIDI-Based Singing Voice Synthesis

Hong-Ru Lee, Wen-Nan Wang, Chih-Hao Hsu, Chih-Fang Huang

doi:10.1007/978-3-642-10467-1_40

Abstract

This paper presents techniques and considerations for implementing a Mandarin singing voice synthesis system built on the Rhythm Speech Lyrics Input (RSLI) unit. The system receives continuous speech of a song's lyrics and synthesizes the intended song using a MIDI-based music database. It is designed around three units. The first is the input unit, which allows the user to specify a musical score and phonetically spelled lyrics to the system. The second is the modification unit, which implements pitch shifting using the PSOLA method. The third is the mixing unit, which handles some undesirable artificial-sounding, buzzy effects and includes echo and vibrato effects. Energy, duration, and spectrum modifications are also implemented in the mixing unit. The synthesized singing voice sounds reasonably good. In a subjective listening test, MOS (mean opinion score) values of 3.3 and 3.2 were obtained for the quality of the synthesized singing voice and for its similarity to the singer's voice, respectively.
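As a rough illustration of the quantities the modification and mixing units work with, the following Python sketch derives a pitch-shift ratio from a MIDI note number, builds a vibrato pitch contour, and adds a single echo to a sample sequence. It is not the authors' implementation and it does not perform PSOLA itself; it only computes the target values a PSOLA-style pitch shifter would need. All function names and default parameter values (vibrato rate and depth, echo delay and gain) are assumptions chosen for illustration.

```python
import math


def midi_to_hz(note: int) -> float:
    """Equal-tempered frequency of a MIDI note (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def pitch_shift_ratio(source_hz: float, target_note: int) -> float:
    """Factor by which the spoken pitch must be scaled to reach the score's note."""
    return midi_to_hz(target_note) / source_hz


def vibrato_contour(base_hz, duration_s, rate_hz=5.5, depth_cents=50.0, frame_rate=100):
    """Frame-wise target pitch with sinusoidal vibrato around base_hz.

    rate_hz, depth_cents and frame_rate are illustrative defaults (assumed).
    """
    n_frames = int(round(duration_s * frame_rate))
    contour = []
    for i in range(n_frames):
        # Deviation in cents, converted to a frequency ratio.
        cents = depth_cents * math.sin(2.0 * math.pi * rate_hz * i / frame_rate)
        contour.append(base_hz * 2.0 ** (cents / 1200.0))
    return contour


def add_echo(samples, sample_rate, delay_s=0.25, gain=0.4):
    """Mix one delayed, attenuated copy of the signal into itself."""
    delay = int(round(delay_s * sample_rate))
    out = list(samples) + [0.0] * delay
    for i, s in enumerate(samples):
        out[i + delay] += gain * s
    return out
```

For example, a syllable spoken at 220 Hz that the score assigns to MIDI note 69 would need `pitch_shift_ratio(220.0, 69)`, i.e. a shift of one octave upward.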
