Annotation in the SpeechDat Projects

H Van Den Heuvel, Eric Sanders, Maurizio Omologo, Asunción Moreno, L.W.J. Boves, Gaël Richard

doi:10.1023/a:1011375311203

Abstract

A large set of spoken language resources (SLR) for various European languages is being compiled in several SpeechDat projects, with the aim of training and testing speech recognizers for voice-driven services, mainly over telephone lines. This paper focuses on the annotation conventions applied to the SpeechDat SLR. These SLR contain typical examples of short monologue speech utterances with simple orthographic transcriptions in a hierarchically simple annotation structure. The annotation conventions and their underlying principles are described and compared to approaches used for related SLR. The synchronization of the orthographic transcriptions with the corresponding speech files is addressed, and the impact of the selected approach on capturing specific phonological and phonetic phenomena is discussed. A number of tools have been developed in the SpeechDat projects to carry out the transcription of the speech; these tools and their properties are briefly described. For all SpeechDat projects, an internal validity check of the databases and their annotations is carried out. The procedure of this validation campaign, the evaluations performed, and some of the results are presented.

Journal: International Journal of Speech Technology
Publication Date: Jan 1, 2001
Citations: 24
