Implementing a Statistical Parametric Speech Synthesis System for a Patient with Laryngeal Cancer.

Krzysztof Szklanny, Jakub Lachowicz

doi:10.3390/s22093188

Abstract

Total laryngectomy, i.e., the surgical removal of the larynx, has a profound influence on a patient’s quality of life. The procedure results in the loss of the natural voice, which constitutes a significant socio-psychological problem for the patient. The main aim of the study was to develop a statistical parametric speech synthesis system for a patient with laryngeal cancer on the basis of the patient’s speech samples recorded shortly before the surgery, and to check whether it was possible to generate speech of a quality close to that of the original recordings. The recordings were based on a representative corpus of the Polish language consisting of 2150 sentences. The recorded voice showed signs of dysphonia, which was confirmed by the auditory-perceptual RBH scale (roughness, breathiness, hoarseness) and by acoustic analysis using the Acoustic Voice Quality Index (AVQI). The speech synthesis model was trained using the Merlin repository. Twenty-five experts participated in the MUSHRA listening tests, rating the synthetic voice at 69.4 on a 0–100 scale in terms of the professional voice-over talent recording, which is a very good result. The authors compared the quality of the synthetic voice to another synthetic speech model trained with the same corpus, but for which a voice-over talent provided the recorded speech samples. The same experts rated that voice at 63.63, which means that the synthetic voice of the patient with laryngeal cancer obtained a higher score than the synthetic voice based on the voice-over talent's recordings. As such, the method enabled the creation of a statistical parametric speech synthesizer for patients awaiting total laryngectomy. As a result, the solution could improve the patient's quality of life and mental wellbeing.
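The MUSHRA scores quoted in the abstract are averages of per-listener ratings on a 0–100 scale. As a minimal sketch of how such ratings could be summarized, the Python snippet below computes the mean and an approximate 95% confidence interval for each system. The function name `mushra_summary`, the system labels, and the rating values are all hypothetical placeholders, not data or code from the study, and the normal-approximation interval is an assumption rather than the authors' reported procedure.

```python
import math
import statistics


def mushra_summary(ratings):
    """Return (mean, 95% CI half-width) for one system's 0-100 MUSHRA ratings."""
    if any(not 0 <= r <= 100 for r in ratings):
        raise ValueError("MUSHRA ratings must lie on the 0-100 scale")
    mean = statistics.mean(ratings)
    # Normal-approximation interval (assumption); a t-quantile would be
    # more appropriate for a small listener panel.
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mean, half_width


# Hypothetical placeholder ratings, NOT data from the study.
hypothetical_ratings = {
    "patient_synthetic": [70, 65, 72, 68, 75],
    "talent_synthetic": [60, 66, 62, 64, 63],
}

for system, ratings in hypothetical_ratings.items():
    mean, half_width = mushra_summary(ratings)
    print(f"{system}: mean={mean:.2f}, 95% CI half-width={half_width:.2f}")
```

Reporting an interval alongside the mean makes it easier to judge whether a gap such as the one between the two synthetic voices is larger than the listener-to-listener spread.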


Journal: Sensors	Publication Date: Apr 21, 2022
Citations: 3	License type: CC BY 4.0


Similar Papers

Effect of Age and Gender on Acoustic Voice Quality Index Across Lifespan: A Cross-sectional Study in Indian Population
T Jayakumar ... H Mohamed Yasin
Journal of Voice | VOL. 36 | 27 Jun 2020

Validation of the Acoustic Voice Quality Index in the Russian language
E.L Choynzonov ... L.A Kononova
Vestnik otorinolaringologii | VOL. 87 | 01 Jan 2021

Validation of Acoustic Voice Quality Index Version 3.01 and Acoustic Breathiness Index in Korean Population
Geun-Hyo Kim ... Yeon-Woo Lee
Journal of Voice | VOL. 35 | 07 Nov 2019

Influence of the Voice Sample Length in Perceptual and Acoustic Voice Quality Analysis
Marina Englert ... Mara Behlau
Journal of Voice | VOL. 36 | 10 Aug 2020
