Abstract

A detailed understanding of how the acoustic patterns of speech sounds are generated by the complex 3D shapes of the vocal tract is a major goal in speech research. The Dresden Vocal Tract Dataset (DVTD) presented here contains geometric and (aero)acoustic data of the vocal tract of 22 German speech sounds (16 vowels, 5 fricatives, 1 lateral), each from one male and one female speaker. The data include the 3D Magnetic Resonance Imaging data of the vocal tracts, the corresponding 3D-printable and finite-element models, and their simulated and measured acoustic and aerodynamic properties. The dataset was evaluated in terms of the plausibility and the similarity of the resonance frequencies determined by the acoustic simulations and measurements, and in terms of the human identification rate of the vowels and fricatives synthesized by the artificially excited 3D-printed vocal tract models. According to both the acoustic and perceptual metrics, most models are accurate representations of the intended speech sounds and can be readily used for research and education.

Highlights

  • Recently, Magnetic Resonance Imaging (MRI) has become an important tool for speech research

  • Vocal tract shapes of sustained speech sounds were acquired from two native German speakers, one male and one female

  • The sound pressure level (SPL) was calculated from the audio signal x(k)
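The paper's own SPL formula is not reproduced in this excerpt. As a rough illustration only, the sketch below uses the standard textbook definition, SPL = 10 log10(mean(x²) / p_ref²), and assumes that the samples x(k) are already calibrated in pascals and that p_ref = 20 µPa. The function name `spl_db` is hypothetical, and the authors' calibration and windowing may differ.

```python
import math

# Assumption: standard reference pressure for airborne sound (20 micropascals).
P_REF = 20e-6


def spl_db(x, p_ref=P_REF):
    """Return the sound pressure level in dB re p_ref.

    x is a sequence of sound pressure samples in pascals (assumed calibrated).
    """
    if not x:
        raise ValueError("signal must contain at least one sample")
    # Mean of the squared samples, i.e. the squared RMS pressure.
    mean_square = sum(s * s for s in x) / len(x)
    if mean_square == 0.0:
        return float("-inf")  # digital silence has no finite level
    return 10.0 * math.log10(mean_square / (p_ref * p_ref))
```

For raw, uncalibrated audio samples the result is only a relative level; an absolute SPL requires a calibration factor from a reference measurement.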


Background & Summary

Recently, Magnetic Resonance Imaging (MRI) has become an important tool for speech research. We present a dataset containing 3D vocal tract images of 22 German speech sounds (16 vowels and 6 consonants), each from one male and one female speaker. The dataset also contains triangle meshes of the inner vocal tract surfaces, extracted from the MRI data and stored as STL files.
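To illustrate how such triangle meshes might be inspected, here is a minimal sketch of a reader for the binary STL layout (80-byte header, a little-endian 32-bit triangle count, then 50 bytes per triangle). Whether the dataset's files are binary or ASCII STL is not stated in this excerpt, so the binary layout is an assumption, and the function name `read_binary_stl` is hypothetical.

```python
import struct


def read_binary_stl(data):
    """Parse binary STL bytes into a list of (normal, (v1, v2, v3)) tuples.

    Assumption: the file is binary STL, not ASCII STL.
    """
    if len(data) < 84:
        raise ValueError("too short to be a binary STL file")
    # Bytes 0-79 are a free-form header; bytes 80-83 hold the triangle count.
    (count,) = struct.unpack_from("<I", data, 80)
    if len(data) < 84 + 50 * count:
        raise ValueError("file is truncated")
    triangles = []
    offset = 84
    for _ in range(count):
        # 12 float32 values (normal + three vertices) and a 16-bit attribute.
        values = struct.unpack_from("<12fH", data, offset)
        normal = values[0:3]
        vertices = (values[3:6], values[6:9], values[9:12])
        triangles.append((normal, vertices))
        offset += 50
    return triangles
```

A file would be read with `open(path, "rb").read()` and passed to the function, where `path` points to one of the STL files in the dataset.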
