A HMM-based Mandarin Chinese singing voice synthesis system

Xian Li, Zengfu Wang

doi:10.1109/jas.2016.7451107

Abstract

We propose a Mandarin Chinese singing voice synthesis system built on hidden Markov model (HMM)-based speech synthesis. A Mandarin Chinese singing voice corpus is recorded, and musical contextual features are designed for training. The F0 and spectrum of the singing voice are modeled simultaneously with context-dependent HMMs. This raises a new problem: the F0 training data for singing voice are always sparse because of the large number of contexts, such as the tempo and pitch of each note, the key, and the time signature. As a result, features that rarely appear in the training data cannot be modeled well. To address this problem, the difference between the F0 of the singing voice and that of the musical score (DF0) is modeled by a single Viterbi training. To overcome over-smoothing of the generated F0 contour, a syllable-level F0 model based on the discrete cosine transform (DCT) is applied, and the F0 contour is generated by integrating the two-level statistical models. Experimental results demonstrate that the proposed system outperforms the baseline system in both objective and subjective evaluations. The proposed system generates a more natural F0 contour, and the syllable-level F0 model makes the singing voice more expressive.
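The two F0 ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and every name in it (`note_to_hz`, `df0`, `syllable_dct`, `syllable_idct`) is hypothetical. It assumes that the score pitch is given as MIDI note numbers, that DF0 is taken in the log-frequency domain, and that only voiced frames (F0 > 0) are passed in; the abstract specifies none of these. The HMM training and the integration of the two model levels are not covered.

```python
import numpy as np

def note_to_hz(midi_note):
    """Convert a MIDI note number to Hz (assumes A4 = note 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((np.asarray(midi_note, dtype=float) - 69.0) / 12.0)

def df0(sung_f0_hz, score_midi):
    """Frame-wise difference between sung log-F0 and score log-F0.

    Assumption: the difference is taken in the log domain and all
    frames are voiced (F0 > 0).
    """
    sung = np.asarray(sung_f0_hz, dtype=float)
    return np.log(sung) - np.log(note_to_hz(score_midi))

def dct_basis(n):
    """Orthonormal DCT-II basis matrix of shape (n, n)."""
    k = np.arange(n)[:, None]   # coefficient index
    t = np.arange(n)[None, :]   # frame index
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (t + 0.5) * k / n)
    basis[0] /= np.sqrt(2.0)    # scale the constant row for orthonormality
    return basis

def syllable_dct(contour, n_coef):
    """Describe one syllable's contour by its first n_coef DCT coefficients."""
    contour = np.asarray(contour, dtype=float)
    return dct_basis(len(contour))[:n_coef] @ contour

def syllable_idct(coefs, n_frames):
    """Rebuild an n_frames contour from (possibly truncated) coefficients."""
    coefs = np.asarray(coefs, dtype=float)
    return dct_basis(n_frames)[:len(coefs)].T @ coefs
```

A fixed, small number of DCT coefficients gives every syllable a parameter vector of the same length regardless of its duration, which is one reason a DCT parameterization is convenient for a syllable-level statistical model. Keeping all coefficients reproduces the contour exactly; keeping fewer yields a smoothed version of its overall shape.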


Journal: IEEE/CAA Journal of Automatica Sinica
Publication Date: Apr 10, 2016
Citations: 7

