Abstract

An important problem for speech science is the relationship between syntactic information, prosodic information, and fundamental frequency contour. To facilitate the study of the interaction among these three factors, all three have been coordinated in a continuous speech database. Specifications of the database are as follows. (1) Speech samples consist of 503 phoneme‐balanced Japanese sentences spoken by a male professional announcer [Kuwabara et al., ICASSP '89, 560–563 (1989)]. (2) Phonetic transcriptions at several levels of detail are provided [Takeda et al., Euro. Conf. Speech Technol. 2, 13–16 (1987)]. (3) Fundamental frequency is automatically extracted every 2.5 ms and extraction errors are corrected by hand. (4) The corresponding sentence is decomposed into constituent words and morphemes with lexical information such as inflectional categories and is assigned a tree structure. This information is semiautomatically generated from input texts. (5) Each utterance is segmented into minor phrases and each accent position is marked by listening to each utterance. This fundamental frequency database has been used to quantify fundamental frequency control factors and to show the effectiveness of this information.
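As a rough illustration of how the annotation layers listed above might sit together in one record, here is a minimal Python sketch. Only the 2.5 ms frame interval comes from the abstract; every class name, field name, and the use of `None` for unvoiced frames are hypothetical choices made for this example, not the database's actual format. The syntactic tree structure is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# From the abstract: fundamental frequency is extracted every 2.5 ms.
F0_FRAME_SHIFT_MS = 2.5


@dataclass
class Morpheme:
    """One morpheme with lexical information (field names are hypothetical)."""
    surface: str
    inflection: Optional[str] = None  # inflectional category, if any


@dataclass
class MinorPhrase:
    """A minor phrase with its hand-marked accent (layout is hypothetical)."""
    start_ms: float
    end_ms: float
    accent_position: Optional[int] = None  # assumed: None means unaccented


@dataclass
class Utterance:
    """One sentence with its F0 track and annotations (hypothetical schema)."""
    sentence_id: int
    # One value per 2.5 ms frame; None marks an unvoiced frame (assumption).
    f0_hz: List[Optional[float]]
    morphemes: List[Morpheme] = field(default_factory=list)
    minor_phrases: List[MinorPhrase] = field(default_factory=list)

    def frame_time_ms(self, index: int) -> float:
        """Time of frame `index`, counted from the start of the utterance."""
        return index * F0_FRAME_SHIFT_MS

    def f0_in_phrase(self, phrase: MinorPhrase) -> List[float]:
        """Voiced F0 values whose frame time lies in [start_ms, end_ms)."""
        return [
            value
            for i, value in enumerate(self.f0_hz)
            if value is not None
            and phrase.start_ms <= self.frame_time_ms(i) < phrase.end_ms
        ]
```

Keeping the F0 track, the morphological analysis, and the prosodic segmentation in one record makes it straightforward to pull out the F0 values that fall inside a given minor phrase, which is the kind of cross-layer query the database is meant to support.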

