Abstract

STRAIGHT time-frequency representations (spectrograms) of singing voice signals at various musical notes and tempos were examined with the aim of developing a high-quality singing voice synthesis system. STRAIGHT is a very high-quality analysis-synthesis system whose spectrogram represents vocal tract information accurately. Based on these observations, a system that converts the musical note or the tempo of an input singing voice signal was implemented. The observations also led to the introduction of frequency warping of the STRAIGHT spectrogram, based on a DP (dynamic programming) matching algorithm, into the system. When the differential of a smoothed spectrum was used as the spectral distance measure in the frequency warping, the subjective quality was better than when the smoothed spectrum was used directly. However, conversion without spectral modification, i.e., with pitch/tempo modification only, produced better quality than conversion using the differential of a smoothed spectrum. This may be because the method using the differential of a smoothed spectrum impairs the naturalness of the voice.
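
As an illustration of the frequency warping idea described above, the following is a minimal sketch of DP matching along the frequency axis for a single pair of spectral frames. It is not the authors' implementation and does not call STRAIGHT. The moving-average smoothing, the absolute-difference local distance, the symmetric step pattern, and all function names (`smooth`, `differential`, `dp_frequency_warp`) are assumptions made for this example. The two calls at the end contrast the two distance measures compared in the abstract: the smoothed spectrum used directly, and its differential along frequency.

```python
from typing import List, Tuple


def smooth(spectrum: List[float], width: int = 3) -> List[float]:
    """Moving-average smoothing (an assumed stand-in for the paper's smoothing)."""
    half = width // 2
    out = []
    for k in range(len(spectrum)):
        window = spectrum[max(0, k - half): k + half + 1]
        out.append(sum(window) / len(window))
    return out


def differential(spectrum: List[float]) -> List[float]:
    """First difference along frequency; the final slope is repeated so the
    length is preserved. Assumes at least two frequency bins."""
    diff = [spectrum[k + 1] - spectrum[k] for k in range(len(spectrum) - 1)]
    return diff + diff[-1:]


def dp_frequency_warp(src: List[float], tgt: List[float]) -> List[Tuple[int, int]]:
    """Align frequency bins of src to bins of tgt by DP matching.

    Returns the minimum-cost monotonic path as (src_bin, tgt_bin) pairs.
    The local distance (absolute difference) is an assumption of this sketch.
    """
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            local = abs(src[i] - tgt[j])
            if i == 0 and j == 0:
                cost[i][j] = local
                continue
            best = min(
                cost[i - 1][j] if i > 0 else inf,
                cost[i][j - 1] if j > 0 else inf,
                cost[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            cost[i][j] = local + best

    # Backtrack from the last bin pair to the first along the cheapest predecessors.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((cost[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((cost[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((cost[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return path


# Toy frames: one spectral peak, shifted by one bin between source and target.
src_frame = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
tgt_frame = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]

# Distance measure 1: smoothed spectrum used directly.
path_direct = dp_frequency_warp(smooth(src_frame), smooth(tgt_frame))

# Distance measure 2: differential of the smoothed spectrum.
path_diff = dp_frequency_warp(
    differential(smooth(src_frame)), differential(smooth(tgt_frame))
)
```

In a full conversion system the resulting path would be applied frame by frame to warp the spectrogram along frequency. That step, and the choice of any path constraints, is left out of this sketch.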
