Fundamental frequency database with linguistic and phonetic information

Masanobu Abe, Hisao Kuwabara, Yoshinori Sagisaka

doi:10.1121/1.2027479

Abstract

An important problem for speech science is the relationship between syntactic information, prosodic information, and fundamental frequency contour. To facilitate the study of the interaction among these three factors, all three have been coordinated in a continuous speech database. Specifications of the database are as follows. (1) Speech samples consist of 503 phoneme‐balanced Japanese sentences spoken by a male professional announcer [Kuwabara et al., ICASSP '89, 560–563 (1989)]. (2) Phonetic transcriptions at several levels of detail are provided [Takeda et al., Euro. Conf. Speech Technol. 2, 13–16 (1987)]. (3) Fundamental frequency is automatically extracted every 2.5 ms and extraction errors are corrected by hand. (4) The corresponding sentence is decomposed into constituent words and morphemes with lexical information such as inflectional categories and is assigned a tree structure. This information is semiautomatically generated from input texts. (5) Each utterance is segmented into minor phrases and each accent position is marked by listening to each utterance. This fundamental frequency database has been used to quantify fundamental frequency control factors and to show the effectiveness of this information.
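As a rough illustration of how the annotation layers in items (3) and (5) could sit together in one record, the following is a minimal Python sketch. It is not the database's actual file format, which the abstract does not describe. The class names, field names, and the use of 0.0 to mark unvoiced frames are all assumptions made for illustration; only the 2.5 ms frame period comes from the abstract. The word, morpheme, and tree-structure layers of item (4) are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional

FRAME_PERIOD_MS = 2.5  # from the abstract: F0 is extracted every 2.5 ms


@dataclass
class MinorPhrase:
    """One minor phrase with an optional accent position (hypothetical layout)."""
    start_ms: float
    end_ms: float
    accent_ms: Optional[float] = None  # None if no accent is marked (assumed)


@dataclass
class Utterance:
    """One sentence with its F0 track and prosodic labels (hypothetical layout)."""
    sentence_id: int
    f0_hz: List[float]  # one value per frame; 0.0 marks unvoiced (assumed convention)
    minor_phrases: List[MinorPhrase] = field(default_factory=list)

    def f0_in_span(self, start_ms: float, end_ms: float) -> List[float]:
        """Return the F0 frames whose indices fall in [start, end)."""
        first = int(round(start_ms / FRAME_PERIOD_MS))
        last = int(round(end_ms / FRAME_PERIOD_MS))
        return self.f0_hz[first:last]

    def mean_voiced_f0(self, phrase: MinorPhrase) -> Optional[float]:
        """Average F0 over the voiced frames of a minor phrase, or None if none are voiced."""
        frames = self.f0_in_span(phrase.start_ms, phrase.end_ms)
        voiced = [v for v in frames if v > 0.0]
        return sum(voiced) / len(voiced) if voiced else None


# Toy data, invented for this example only.
utt = Utterance(
    sentence_id=1,
    f0_hz=[0.0, 120.0, 130.0, 0.0, 110.0, 100.0],
    minor_phrases=[MinorPhrase(start_ms=0.0, end_ms=10.0, accent_ms=5.0)],
)
print(utt.mean_voiced_f0(utt.minor_phrases[0]))
```

Storing phrase boundaries and accent positions in milliseconds, and converting to frame indices only on lookup, keeps the labels independent of the frame period. That is a design choice of this sketch, not something stated in the abstract.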
