Abstract

The tracking of deformable objects in image sequences has been a topic of intensive research for many years, and many application-specific solutions have been proposed. In this work, we describe a method developed to robustly track the tissue structures of the human vocal tract in midsagittal, real-time magnetic resonance (MR) images. The goal of the algorithm is to fully automatically extract the vocal tract outline, the positions of the articulators, and the tract variables, in order to facilitate the study of the shaping of the vocal tract during speech production. The algorithm is unsupervised and requires only a one-time initialization step for a particular subject. Importantly, the tracking algorithm operates on the spatial-frequency-domain representation of the underlying images and is hence specifically suited to the data produced in the MR imaging process. The proposed method carries out a multiregion segmentation of the individual MR images using an anatomically informed model of the vocal tract, whose fit to the observed image data is hierarchically optimized using an anatomically informed gradient descent procedure. The key mathematical components of the algorithm are the closed-form solution of the two-dimensional Fourier transform of a polygonal shape function and the design of alternative gradient descent flows for the iterative solution of an overdetermined nonlinear least squares optimization problem. Various examples of segmented real-time MR images and a summary of open challenges will be presented. [Work supported by NIH.]
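The abstract gives no implementation details, so the following is only an illustrative sketch of the general idea it describes: fitting a polygonal region model to image data in the spatial-frequency domain by gradient descent on an overdetermined nonlinear least-squares objective. This toy substitutes a smooth sigmoid indicator of a convex polygon and a numerical FFT for the paper's closed-form polygonal Fourier transform, and finite-difference gradients with backtracking for the paper's analytic, anatomically informed descent flows; all function names and parameter values here are hypothetical.

```python
import numpy as np

def soft_polygon_mask(theta, n=32, tau=0.7):
    """Smooth indicator of a convex, counter-clockwise polygon.

    Product of sigmoids of the signed distance to each edge line
    (positive side = interior), so the mask is differentiable in the
    vertex coordinates -- a stand-in for an exact polygonal shape
    function. `theta` holds the vertices as a flat (x, y) array.
    """
    v = np.asarray(theta, dtype=float).reshape(-1, 2)
    ys, xs = np.mgrid[0:n, 0:n]
    px, py = xs + 0.5, ys + 0.5
    mask = np.ones((n, n))
    for i in range(len(v)):
        (x1, y1), (x2, y2) = v[i], v[(i + 1) % len(v)]
        ex, ey = x2 - x1, y2 - y1
        d = (ex * (py - y1) - ey * (px - x1)) / np.hypot(ex, ey)
        mask *= 1.0 / (1.0 + np.exp(-d / tau))
    return mask

def objective(theta, kdata, n=32):
    # Overdetermined nonlinear least squares in the spatial-frequency
    # domain: n*n complex residuals vs. only 2*m vertex parameters.
    resid = np.fft.fft2(soft_polygon_mask(theta, n)) - kdata
    return float(np.mean(np.abs(resid) ** 2))

def fit(theta0, kdata, n=32, steps=40, eps=1e-3):
    # Finite-difference gradient descent with backtracking line search,
    # in place of analytically derived gradient descent flows.
    theta = np.asarray(theta0, dtype=float).copy()
    f0 = objective(theta, kdata, n)
    for _ in range(steps):
        g = np.zeros_like(theta)
        for j in range(theta.size):
            tp = theta.copy()
            tp[j] += eps
            g[j] = (objective(tp, kdata, n) - f0) / eps
        step = 1.0
        while step > 1e-8:
            cand = theta - step * g
            fc = objective(cand, kdata, n)
            if fc < f0:  # accept only strict decrease
                theta, f0 = cand, fc
                break
            step *= 0.5
    return theta, f0
```

A usage sketch: synthesize "observed" k-space data from a known triangle, perturb its vertices, and recover them by minimizing the frequency-domain residual; the real method instead fits a multiregion anatomical model to measured MR k-space data.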
