Phonetically Balanced Text Corpus Design Using a Similarity Measure for a Stereo Super-Wideband Speech Database

Yoo Rhee Oh, Mina Kim, Mi Suk Lee, Yong Guk Kim, Hong Kook Kim, Hyun Joo Bae

doi:10.1587/transinf.e94.d.1459

Abstract

In this paper, we propose a text corpus design method for a Korean stereo super-wideband speech database. Since speech coding generally requires only a small text corpus, the corpus should be designed to reflect the pronunciation behavior of natural conversation so that speech quality tests can be carried out efficiently. To this end, the proposed design method uses a similarity measure between the phoneme distribution observed in natural conversation and that of the designed text corpus. We first collect and refine text data from textbooks and websites. Next, a corpus is designed from the refined text data, using the similarity measure to compare phoneme distributions. We then construct a Korean stereo super-wideband speech (K-SW) database using the designed text corpus, where the recording environment is set to meet the conditions defined by ITU-T. Finally, the subjective quality of the K-SW database is evaluated using an ITU-T super-wideband codec in order to demonstrate that the K-SW database is useful for developing and evaluating super-wideband codecs.
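
The abstract does not state which similarity measure or selection procedure the authors use, so the following is only a minimal sketch of the general idea of choosing sentences whose pooled phoneme distribution resembles a reference distribution. It assumes cosine similarity and a greedy selection loop, and the names `distribution`, `cosine_similarity`, and `select_sentences` are hypothetical, chosen for illustration. Each candidate sentence is assumed to be already converted to a list of phoneme symbols.

```python
import math
from collections import Counter


def distribution(counts):
    """Normalize raw phoneme counts into relative frequencies."""
    total = sum(counts.values())
    return {unit: n / total for unit, n in counts.items()} if total else {}


def cosine_similarity(p, q):
    """Cosine similarity between two sparse distributions (dicts).

    Assumption: the paper's actual measure is not given in the abstract;
    cosine similarity is used here only as a stand-in.
    """
    dot = sum(value * q.get(unit, 0.0) for unit, value in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def select_sentences(candidates, reference, size):
    """Greedily pick up to `size` sentences (lists of phoneme symbols)
    so that the pooled phoneme distribution stays close to `reference`.

    Assumption: greedy selection is an illustrative strategy, not
    necessarily the procedure used in the paper.
    """
    selected = []
    pooled = Counter()
    remaining = list(candidates)
    for _ in range(min(size, len(remaining))):
        # Choose the sentence that maximizes similarity once added to the pool.
        best = max(
            remaining,
            key=lambda s: cosine_similarity(
                distribution(pooled + Counter(s)), reference
            ),
        )
        selected.append(best)
        pooled += Counter(best)
        remaining.remove(best)
    return selected, cosine_similarity(distribution(pooled), reference)


if __name__ == "__main__":
    # Toy data: single letters stand in for phoneme symbols.
    reference = distribution(Counter("aabbc"))
    candidates = [list("aaa"), list("ab"), list("abc"), list("ccc")]
    chosen, score = select_sentences(candidates, reference, size=2)
    print(chosen, round(score, 3))
```

In practice the reference distribution would be estimated from transcribed natural conversation and the candidates from the refined textbook and website text; a divergence-based measure could be substituted for `cosine_similarity` without changing the selection loop.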

Journal: IEICE Transactions on Information and Systems
Publication Date: Jan 1, 2011
Citations: 2
License type: free
