Abstract
We propose a method for co-registering speech articulatory/acoustic data from two modalities that provide complementary advantages. Electromagnetic Articulography (EMA) provides high temporal resolution (100 samples/second in the WAVE system) and flesh-point tracking, while real-time Magnetic Resonance Imaging (rtMRI; 23 frames/second) offers a complete midsagittal view of the vocal tract, including articulated structures and the articulatory environment. Co-registration was achieved through iterative alignment in the acoustic and articulatory domains. Acoustic signals were aligned temporally using Dynamic Time Warping, while articulatory signals were aligned variously by minimizing the mean total error between articulometry data and estimated corresponding flesh points, and by using mutual information derived from articulatory parameters for each sentence. We demonstrate our method on a subset of the TIMIT corpus elicited from a male and a female speaker of American English, and illustrate the benefits of co-registered multimodal data in the study of liquid and fricative consonant production in rapid speech. [Supported by NIH and NSF grants.]
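As an illustration only (not the authors' implementation), the Dynamic Time Warping step used for temporal alignment of the acoustic signals can be sketched as follows. The function name `dtw_align` and the simple absolute-difference local cost are assumptions for this sketch; in practice the local cost would be computed between acoustic feature vectors (e.g., MFCC frames) rather than scalars.

```python
import numpy as np

def dtw_align(x, y):
    """Dynamic Time Warping (illustrative sketch): return the minimal
    cumulative alignment cost and the warping path between two 1-D
    sequences x and y, using |x[i] - y[j]| as the local distance."""
    n, m = len(x), len(y)
    # cost[i, j] = minimal cumulative cost aligning x[:i] with y[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])          # local distance
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    # Backtrack from (n, m) to recover the warping path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([cost[i - 1, j - 1],  # diagonal
                          cost[i - 1, j],      # up
                          cost[i, j - 1]])     # left
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return cost[n, m], path[::-1]
```

For example, aligning `[1, 2, 3]` against `[1, 2, 2, 3]` yields zero total cost, with the repeated `2` in the second sequence warped onto the single `2` in the first.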