Spectral and pitch modeling with hybrid approach to singing voice synthesis using hidden semi-Markov model and deep neural network

Kouki Hongo, Takashi Nose, Akinori Ito

doi:10.1121/1.4969155

Abstract

We propose a corpus-based singing voice synthesis system that combines a hidden Markov model (HMM) with a deep neural network (DNN). In text-to-speech synthesis, DNN-based methods have recently been reported to yield better speech quality than HMM-based ones. However, when we introduced the DNN to statistical singing voice synthesis, it did not improve the quality of the synthetic singing voice. We therefore used the DNN in a different way. Instead of modeling the speech spectra, we used the DNN to model the difference between the spectra of the natural singing voice and the synthetic singing voice generated by the HMM. To do so, the DNN maps input musical information, such as lyrics, tones, and durations, to the difference in output acoustic features between the natural and synthetic singing voice. This allows us to reconstruct the spectral fine structure of the singing voice generated by HMMs. Our results showed that the proposed method improved the quality of the synthetic singing voice compared to the conventional methods.
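The difference-modeling idea described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the data are synthetic, the "HMM output" is simulated by simply shrinking the natural features, and the network size, learning rate, and all names (`context`, `hmm_out`, `train_residual_net`) are assumptions made for illustration. It only shows the general pattern of training a network on the natural-minus-HMM difference and adding the predicted difference back at synthesis time.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data; every size and generator here is an assumption for illustration.
n_frames, n_ctx, n_feat = 512, 8, 4
context = rng.normal(size=(n_frames, n_ctx))   # stand-in for score features (lyrics, tones, durations)
natural = np.tanh(context @ rng.normal(size=(n_ctx, n_feat)))  # stand-in for natural acoustic features
hmm_out = 0.5 * natural                        # stand-in for over-smoothed HMM output

train, test = slice(0, 400), slice(400, n_frames)


def train_residual_net(x, y, hidden=16, lr=0.05, epochs=2000, seed=1):
    """Fit a one-hidden-layer tanh network to y with full-batch gradient descent."""
    r = np.random.default_rng(seed)
    w1 = r.normal(scale=0.1, size=(x.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = r.normal(scale=0.1, size=(hidden, y.shape[1]))
    b2 = np.zeros(y.shape[1])
    n = x.shape[0]
    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)               # hidden activations
        err = h @ w2 + b2 - y                  # prediction error
        grad_h = (err @ w2.T) * (1.0 - h**2)   # backpropagate through tanh
        w2 -= lr * (h.T @ err) / n
        b2 -= lr * err.mean(axis=0)
        w1 -= lr * (x.T @ grad_h) / n
        b1 -= lr * grad_h.mean(axis=0)
    return w1, b1, w2, b2


def predict(params, x):
    """Predict the feature difference for the given context features."""
    w1, b1, w2, b2 = params
    return np.tanh(x @ w1 + b1) @ w2 + b2


# Training target: the difference between natural and HMM-generated features.
params = train_residual_net(context[train], natural[train] - hmm_out[train])

# Synthesis: add the predicted difference back onto the HMM output.
corrected = hmm_out[test] + predict(params, context[test])

mse_hmm = np.mean((natural[test] - hmm_out[test]) ** 2)
mse_hybrid = np.mean((natural[test] - corrected) ** 2)
print(f"HMM only: {mse_hmm:.4f}  HMM + predicted difference: {mse_hybrid:.4f}")
```

In this toy setting the network only has to learn what the simulated HMM output is missing, which is the motivation the abstract gives for modeling the difference rather than the spectra themselves.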

Journal: Journal of the Acoustical Society of America
Publication Date: Oct 1, 2016
Citations: 2

Similar Papers

Sinsy: A Deep Neural Network-Based Singing Voice Synthesis System
Yukiya Hono ... Keiichiro Oura
IEEE/ACM Transactions on Audio, Speech, and Language Processing | VOL. 29
01 Jan 2020

Integration of speaker and pitch adaptive training for HMM-based singing voice synthesis
Kanako Shirota ... Kei Hashimoto
01 May 2014

Singing Voice Synthesis Based on Generative Adversarial Networks
Yukiya Hono ... Keiichiro Oura
01 May 2019

Pitch adaptive training for HMM-based singing voice synthesis
Keiichiro Oura ... Keiichi Tokuda
01 Mar 2012
