Exemplar-based emotional voice conversion using non-negative matrix factorization

Ryo Aihara, Reina Ueda, Yasuo Ariki, Tetsuya Takiguchi

doi:10.1109/apsipa.2014.7041640

Abstract

This paper presents an emotional voice conversion (VC) technique based on non-negative matrix factorization (NMF), in which parallel exemplars are used to encode the source speech signal and synthesize the target speech signal. The input source spectrum is decomposed into source spectrum exemplars and their weights. By replacing the source exemplars with target exemplars, the converted spectrum and F0 are constructed from the target exemplars and the target F0 that is paired with those exemplars. To reduce computation time, we adopt NMF using active Newton set algorithms in our VC method. We carried out emotional voice conversion tasks that convert an emotional voice into a neutral voice. The effectiveness of the method was confirmed by objective and subjective evaluations.
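The decompose-and-replace idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`estimate_weights`, `convert`), the Euclidean cost with multiplicative updates (used here in place of the active Newton set solver the abstract mentions), and the weighted-average F0 rule are all assumptions made for illustration.

```python
import numpy as np


def estimate_weights(V, A, n_iter=200, sparsity=0.0, eps=1e-12):
    """Estimate non-negative weights H with V ~= A @ H, keeping A fixed.

    V: (n_bins, n_frames) source magnitude spectra.
    A: (n_bins, n_exemplars) source exemplar dictionary.
    Assumption: multiplicative updates for the Euclidean cost, chosen for
    brevity; this is NOT the solver referred to in the abstract.
    """
    H = np.ones((A.shape[1], V.shape[1]))
    AtV = A.T @ V
    AtA = A.T @ A
    for _ in range(n_iter):
        # Ratio of non-negative terms keeps H non-negative.
        H *= AtV / (AtA @ H + sparsity + eps)
    return H


def convert(V_src, A_src, A_tgt, f0_tgt, **kwargs):
    """Convert source spectra by swapping in the parallel target exemplars.

    A_tgt: (n_bins, n_exemplars) target exemplars, column-aligned with A_src.
    f0_tgt: (n_exemplars,) F0 value paired with each target exemplar.
    """
    H = estimate_weights(V_src, A_src, **kwargs)
    spec = A_tgt @ H  # converted spectrum built from target exemplars
    # Assumption: F0 is the weight-normalized average of the paired target F0.
    w = H / (H.sum(axis=0, keepdims=True) + 1e-12)
    f0 = f0_tgt @ w
    return spec, f0, H


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A_src = rng.random((20, 8))          # hypothetical source exemplars
    A_tgt = rng.random((20, 8))          # hypothetical parallel target exemplars
    f0_tgt = rng.uniform(100.0, 250.0, size=8)
    V_src = A_src @ rng.random((8, 5))   # synthetic input frames
    spec, f0, H = convert(V_src, A_src, A_tgt, f0_tgt)
    print(spec.shape, f0.shape, H.shape)
```

Because the source and target dictionaries are parallel (column k of each describes the same aligned frame), the weights estimated on the source side can be reused unchanged on the target side, which is what makes the exemplar swap possible.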

Full Text