Abstract

Speech processing technology has great potential in the medical field to provide beneficial solutions for both patients and doctors. Speech interfaces, represented by speech synthesis and speech recognition, can be used to transcribe medical documents, control medical devices, mitigate speech and hearing impairments, and assist the visually impaired. However, accurate prediction of prosodic phrase boundaries is essential for natural speech synthesis. This study proposes a method for building a reliable training corpus for deep learning-based prosodic boundary prediction models. In addition, we present a way to derive a rule-based model that predicts prosodic boundaries from the constructed corpus and to use its output to train a deep learning-based model. As a result, we built a coherent corpus even though many annotators participated in its development. The estimated pairwise agreement of the corpus annotations is between 0.7477 and 0.7916, and the kappa coefficient (K) is between 0.7057 and 0.7569. In addition, the deep learning-based model trained with the rules obtained from the corpus achieved a prediction accuracy of 78.57% for three-level prosodic phrase boundaries and 87.33% for two-level prosodic phrase boundaries.
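The agreement measures cited above follow standard definitions: pairwise agreement is the fraction of tokens on which two annotators assign the same break label, and Cohen's kappa corrects that figure for chance agreement. The sketch below is purely illustrative; the annotator data and the three-level labeling scheme (0 = no break, 1 = minor phrase break, 2 = major phrase break) are assumptions made for demonstration, not the paper's actual corpus or figures.

    from collections import Counter
    from itertools import combinations

    def pairwise_agreement(a, b):
        """Fraction of tokens on which two annotators chose the same break label."""
        return sum(x == y for x, y in zip(a, b)) / len(a)

    def cohens_kappa(a, b):
        """Chance-corrected agreement between two annotators (Cohen's kappa)."""
        p_o = pairwise_agreement(a, b)
        # Chance agreement from each annotator's marginal label distribution.
        ca, cb, n = Counter(a), Counter(b), len(a)
        p_e = sum((ca[label] / n) * (cb[label] / n) for label in set(a) | set(b))
        return (p_o - p_e) / (1 - p_e)

    # Hypothetical break labels for one sentence from three annotators
    # (0 = no break, 1 = minor phrase break, 2 = major phrase break).
    annotations = [
        [0, 1, 0, 2, 0, 1, 0, 2],
        [0, 1, 0, 2, 0, 0, 0, 2],
        [0, 1, 0, 1, 0, 1, 0, 2],
    ]

    for (i, a), (j, b) in combinations(enumerate(annotations), 2):
        print(f"annotators {i} vs {j}: agreement={pairwise_agreement(a, b):.4f}, "
              f"kappa={cohens_kappa(a, b):.4f}")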

Highlights

  • Speech processing technology has demonstrated great potential to provide beneficial solutions for both patients and doctors in smart healthcare

  • Voice interfaces, represented by speech synthesis and speech recognition, can be used to transcribe medical documents, control medical devices, mitigate speech and hearing impairments, and support the visually impaired

  • This study proposes a new methodology for the reliable prediction of prosodic breaks using linguistic knowledge and bi-gram information obtained from a small-scale corpus, as sketched below
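The bi-gram component mentioned in the last highlight can be illustrated as follows: count how often each break level follows a given pair of adjacent part-of-speech tags in the annotated corpus, then predict the most frequent level for unseen text. Everything below (the class name, the tag set, the toy corpus, and the 0/1/2 break levels) is a hypothetical sketch of such a bi-gram statistic, not the paper's actual model or rule set.

    from collections import Counter, defaultdict

    class BigramBreakPredictor:
        """Predicts the break level after each word from POS-tag bi-gram statistics."""

        def __init__(self):
            # counts[(left_tag, right_tag)][break_level] -> frequency in the corpus
            self.counts = defaultdict(Counter)

        def train(self, corpus):
            """corpus: sentences given as lists of (pos_tag, break_level_after_word)."""
            for sentence in corpus:
                for (left, brk), (right, _) in zip(sentence, sentence[1:]):
                    self.counts[(left, right)][brk] += 1

        def predict(self, tags, default=0):
            """Return the most frequent break level observed for each adjacent tag pair."""
            levels = []
            for left, right in zip(tags, tags[1:]):
                dist = self.counts.get((left, right))
                levels.append(dist.most_common(1)[0][0] if dist else default)
            return levels

    # Hypothetical toy corpus (0 = no break, 1 = minor break, 2 = major break).
    corpus = [
        [("NOUN", 0), ("JOSA", 1), ("NOUN", 0), ("VERB", 2)],
        [("NOUN", 0), ("JOSA", 0), ("ADJ", 0), ("NOUN", 2)],
    ]
    model = BigramBreakPredictor()
    model.train(corpus)
    print(model.predict(["NOUN", "JOSA", "NOUN", "VERB"]))  # -> [0, 1, 0]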


Introduction

Speech processing technology has demonstrated great potential to provide beneficial solutions for both patients and doctors in smart healthcare. Recent advances in speech processing and other advanced technologies, including the Internet of Things (IoT) and communication systems, have significantly improved contemporary healthcare systems [1,2,3]. Voice interfaces, represented by speech synthesis and speech recognition, can be used to transcribe medical documents, control medical devices, mitigate speech and hearing impairments, and support the visually impaired. Speech can also serve as a biomarker in diagnosing psychological disorders. Environmental control assistance (e.g., device control, audio level control, nursing assistance requests, decision-making assistance) can aid in the recovery of patients with reduced mobility [7].

