Abstract

Despite the development of new imaging techniques, existing X-ray data remain an appropriate tool for studying speech production phenomena. However, to exploit these images, the shapes of the vocal tract articulators must first be extracted. This task, usually performed manually, is long and laborious. This paper describes a semi-automatic technique for facilitating the extraction of vocal tract contours from complete sequences of large existing cineradiographic databases in the context of continuous speech production. The proposed method efficiently combines the human expertise required to mark a small number of key images with automatic indexing of the video data to infer dynamic 2D data. Manually acquired geometrical data are associated with each image of the sequence via a similarity measure based on the low-frequency Discrete Cosine Transform (DCT) components of the images. Moreover, to reduce the reconstruction error and improve the geometrical contour estimation, we apply post-processing steps such as neighborhood averaging and temporal filtering. The method is applied independently to each articulator (tongue, velum, lips, and mandible). The acquired contours are then combined to reconstruct the movements of the entire vocal tract. We carry out evaluations, including comparisons with manual markings and with another semi-automatic method.
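The DCT-based indexing step described above can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation: each frame is summarized by the low-frequency block of its 2-D DCT, and every frame is matched to the manually marked key frame with the nearest signature under Euclidean distance. The function names, the `n_coeffs` parameter, and the toy data are assumptions introduced here for illustration.

```python
# Hypothetical sketch of DCT-based frame indexing: represent each frame
# by its low-frequency 2-D DCT coefficients and match it to the nearest
# key frame. Names and parameters are illustrative, not from the paper.
import numpy as np
from scipy.fft import dctn

def dct_signature(image, n_coeffs=8):
    """Keep only the top-left (low-frequency) n_coeffs x n_coeffs block
    of the 2-D DCT as a compact descriptor of the frame."""
    coeffs = dctn(image, norm="ortho")
    return coeffs[:n_coeffs, :n_coeffs].ravel()

def nearest_key_frame(frame, key_signatures):
    """Return the index of the key frame whose DCT signature is closest
    (Euclidean distance) to that of `frame`."""
    sig = dct_signature(frame)
    dists = np.linalg.norm(key_signatures - sig, axis=1)
    return int(np.argmin(dists))

# Toy usage: two synthetic "key frames" and a query frame that is a
# slightly perturbed copy of the first one.
rng = np.random.default_rng(0)
key_frames = [rng.random((64, 64)), rng.random((64, 64))]
key_sigs = np.stack([dct_signature(k) for k in key_frames])
query = key_frames[0] + 0.01 * rng.random((64, 64))
print(nearest_key_frame(query, key_sigs))  # prints 0
```

In the paper's pipeline, the contours marked on the selected key frame would then be transferred to the query frame, after which neighborhood averaging and temporal filtering smooth the resulting contour sequence.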
