Abstract

This paper presents an emotional voice conversion (VC) technology using non-negative matrix factorization, where parallel exemplars are introduced to encode the source speech signal and synthesize the target speech signal. The input source spectrum is decomposed into the source spectrum exemplars and their weights. By replacing source exemplars with target exemplars, the converted spectrum and FO are constructed from the target exemplars and the target FO, which is paired with exemplars. In order to reduce the computational time, we adopted non-negative matrix factorization using active Newton set algorithms to our VC method. We carried out emotional voice conversion tasks, which convert an emotional voice into a neutral voice. The effectiveness of this method was confirmed with objective and subjective evaluations.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call