Abstract
This paper presents a voice conversion (VC) method for a person with an articulation disorder resulting from athetoid cerebral palsy. The movement of such speakers is limited by their athetoid symptoms, and their consonants are often unstable or unclear, which makes it difficult for them to communicate. Exemplar-based spectral conversion using nonnegative matrix factorization (NMF) is applied to a voice with an articulation disorder. To preserve the speaker’s individuality, we use an individuality-preserving dictionary constructed from the source speaker’s vowels and the target speaker’s consonants. Using this dictionary, we can create a natural and clear voice that preserves the individuality of the speaker’s voice. Experimental results indicate that NMF-based VC performs considerably better than conventional VC based on a Gaussian mixture model (GMM).
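As a rough illustration of the idea summarized in the abstract, the following is a minimal sketch of exemplar-based conversion with a fixed dictionary. It is not the authors' implementation. It assumes that the source and target dictionaries are parallel (column k of each holds the same aligned exemplar), that activations are estimated with the standard multiplicative update for the KL divergence with an L1 sparsity penalty, and that the individuality-preserving dictionary can be approximated by taking vowel columns from the source speaker and consonant columns from the target speaker. All function names, the `is_vowel` mask, and the parameter values are hypothetical.

```python
import numpy as np


def build_individuality_preserving_dictionary(A_src, A_tgt, is_vowel):
    """Mix two parallel dictionaries column by column (illustrative assumption).

    A_src, A_tgt: (n_bins, n_exemplars) nonnegative spectral exemplars.
    is_vowel:     (n_exemplars,) boolean mask; True keeps the source column.
    """
    return np.where(is_vowel[np.newaxis, :], A_src, A_tgt)


def estimate_activations(X, A, n_iter=200, sparsity=0.1, eps=1e-12):
    """Estimate nonnegative H with X ~= A @ H while the dictionary A stays fixed.

    Uses the multiplicative update for KL divergence plus an L1 penalty on H.
    """
    rng = np.random.default_rng(0)
    H = rng.random((A.shape[1], X.shape[1])) + eps
    ones = np.ones_like(X)
    for _ in range(n_iter):
        AH = A @ H + eps
        H *= (A.T @ (X / AH)) / (A.T @ ones + sparsity)
    return H


def convert(X_src, A_src, A_out, **kwargs):
    """Decompose the input with the source dictionary, rebuild with A_out."""
    H = estimate_activations(X_src, A_src, **kwargs)
    return A_out @ H
```

In this sketch the activations are shared between the two dictionaries, so a frame that activates a consonant exemplar in the source dictionary is rebuilt from the corresponding target exemplar, while vowel frames are rebuilt from the speaker's own exemplars.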
Highlights
In recent years, a number of assistive technologies using information processing have been proposed, for example, sign language recognition using image recognition technology [1,2,3], text reading systems from natural scene images [4,5,6], and the design of wearable speech synthesizers [7].
The difference in mean opinion score (MOS) between nonnegative matrix factorization (NMF)-based voice conversion (VC) and Gaussian mixture model (GMM)-based VC was confirmed by a p value test at the 0.05 level.
The p value test at the 0.05 level showed no significant difference between GMM-based VC and the utterances of a person with an articulation disorder.
Summary
A number of assistive technologies using information processing have been proposed, for example, sign language recognition using image recognition technology [1,2,3], text reading systems from natural scene images [4,5,6], and the design of wearable speech synthesizers [7]. We focus on a person with an articulation disorder resulting from athetoid cerebral palsy. There are about 34,000 people with speech impediments associated with an articulation disorder in Japan alone, and one of the causes of speech impediments is cerebral palsy. Cerebral palsy results from damage to the central nervous system, and this damage causes movement disorders. The onset of the disorder is generally placed at one of three times: before birth, at the time of delivery, or after birth. Cerebral palsy is classified into the following types: (1) spastic, (2) athetoid, (3) ataxic, (4) atonic, (5) rigid, and (6) a mixture of these types [8].
More From: EURASIP Journal on Audio, Speech, and Music Processing