PHONEix: Acoustic Feature Processing Strategy for Enhanced Singing Pronunciation With Phoneme Distribution Predictor

Yuning Wu, Qin Jin, Dongji Gao, Jiatong Shi, Tao Qian

doi:10.1109/icassp49357.2023.10097204


Open Access



Publication Date: Jun 4, 2023

Affiliation: Renmin University of China, Johns Hopkins University, Carnegie Mellon University

#Singing Voice Synthesis #Music Score


Abstract

Singing voice synthesis (SVS), the task of generating a singing voice from a music score, has drawn much attention in recent years. A central challenge in SVS is that singers pronounce flexibly: the same music score can be sung with varying pronunciation. Most previous SVS work does not handle the misalignment between the music score and the actual singing well. In this paper, we propose PHONEix, an acoustic feature processing strategy with a phoneme distribution predictor, to narrow the gap between the music score and the singing voice. The strategy can be easily adopted in different SVS systems. Extensive experiments in various settings demonstrate the effectiveness of PHONEix in both objective and subjective evaluations.

Full Text