In Autosegmental-Metrical models of intonational phonology, different types of pitch accents, phrase accents, and boundary tones concatenate to create a set of phonologically distinct phrase-final nuclear tunes. This study asks whether an eight-way distinction in nuclear tune shape in American English, predicted from the combination of two (monotonal) pitch accents, two phrase accents, and two boundary tones, is evident in speech production and in speech perception. F0 trajectories from a large-scale imitative speech production experiment were analyzed using bottom-up (k-means) clustering, neural net classification, GAMM modeling, and modeling of turning point alignment. Listeners’ perception of the same tunes was tested in a perceptual discrimination task and related to the imitation results. Emergent grouping of tunes in the clustering analysis, and related classification accuracy from the neural net, show a merging of some of the predicted distinctions among tunes, whereby tune shapes that vary primarily in the scaling of final f0 are not reliably distinguished. Within five emergent clusters, subtler distinctions among tunes are evident in GAMMs and f0 turning point modeling. Clustering of individual participants’ production data shows a range of partitions of the data, with nearly all participants making a primary distinction between a class of High-Rising and Non-High-Rising tunes, and with up to four secondary distinctions among the non-Rising class. Perception results show a similar pattern, with poor pairwise discrimination for tunes that differ primarily, but by a small degree, in final f0, and highly accurate discrimination when just one member of a pair is in the High-Rising tune class. Together, the results suggest a hierarchy of distinctiveness among nuclear tunes, with a robust distinction based on holistic tune shape and poorly differentiated distinctions between tunes with the same holistic shape but small differences in final f0.
The observed distinctions from clustering, classification, and perception analyses align with the tonal specification of a binary pitch accent contrast {H*, L*} and a maximally ternary {H%, M%, L%} boundary tone contrast; the findings do not support distinct tonal specifications for the phrase accent and boundary tone from the AM model. 