Connecting Intonation Labels to Mathematical Descriptions of Fundamental Frequency

Esther Grabe,John Coleman,Greg Kochanski

doi:10.1177/00238309070500030101

Abstract

The mathematical models of intonation used in speech technology are often inaccessible to linguists. By the same token, phonological descriptions of intonation are rarely used by speech technologists, as they cannot be implemented directly in applications. Consequently, these research communities do not benefit much from each other's insights. In this paper, we explore the interface between the disciplines, in search of bridges between intonational phonology and speech technology. In a corpus of speech data from seven dialects of English, we hand-labeled over 700 sentences and identified seven nuclear accent types. Then we fitted a third-order polynomial to the fundamental frequency (F0) contour in the region around the accent mark. The polynomial captures the local shape (time-dependence) of F0 in a few numbers, in our case, four coefficients. The coefficients were subjected to statistical analysis. Nineteen of the 21 pairs of accent types differed significantly in one or more coefficients. Our approach bridges the gap between intonational phonology and speech technology. It provides quantitative, empirically testable models of intonation labels that can be implemented in applications.

Full Text