Abstract
Vowel space data (a two-dimensional F1/F2 plot) is of interest to phoneticians for comparing different accents, languages, speaker styles and individual speakers. Current automatic methods used by speech technologists do not generally produce traditional vowel space models (see [6] for an overview); instead, they tend to produce hyper-dimensional code books covering the speaker’s entire speech stream. This makes it difficult to relate the results generated by these methods to observations in laboratory phonetics. To address this problem, a model was developed based on a Gaussian mixture density function, fitted to F1/F2 data using expectation maximisation, which produces a probability distribution in F1/F2 space. Speech was pre-processed using voicing to excerpt vowel data automatically, without any need for segmentation, and a parametric fit algorithm [7] was applied to calculate likely vowel targets. The result was a clear visualisation of a speaker’s vowel space that requires no segmented or labelled speech.
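As an illustration only, fitting a Gaussian mixture to F1/F2 points by expectation maximisation can be sketched as follows. This is a minimal sketch, not the authors' implementation. The synthetic formant values, the choice of three components, the regularisation constant and the function names (`gaussian_pdf`, `fit_gmm`, `mixture_density`) are all assumptions made for the example. The voicing-based excerpting and the parametric fit algorithm of [7] are not reproduced.

```python
import math
import random


def gaussian_pdf(x, mean, cov):
    """Density of a 2-D Gaussian with a full 2x2 covariance at point x."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form using the closed-form inverse of a 2x2 matrix.
    q = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))


def fit_gmm(points, k, iters=50, seed=0, reg=1.0):
    """Fit a k-component Gaussian mixture to 2-D points with plain EM."""
    rng = random.Random(seed)
    n = len(points)
    # Initialise means from random data points and covariances from the
    # overall per-axis variance (an assumed, simple initialisation).
    means = rng.sample(points, k)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    vx = sum((p[0] - mx) ** 2 for p in points) / n
    vy = sum((p[1] - my) ** 2 for p in points) / n
    covs = [((vx, 0.0), (0.0, vy)) for _ in range(k)]
    weights = [1.0 / k] * k

    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for p in points:
            dens = [w * gaussian_pdf(p, m, c)
                    for w, m, c in zip(weights, means, covs)]
            total = sum(dens) or 1e-300  # guard against underflow
            resp.append([d / total for d in dens])

        # M-step: re-estimate weight, mean and covariance per component.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            mxj = sum(r[j] * p[0] for r, p in zip(resp, points)) / nj
            myj = sum(r[j] * p[1] for r, p in zip(resp, points)) / nj
            sxx = sum(r[j] * (p[0] - mxj) ** 2
                      for r, p in zip(resp, points)) / nj + reg
            syy = sum(r[j] * (p[1] - myj) ** 2
                      for r, p in zip(resp, points)) / nj + reg
            sxy = sum(r[j] * (p[0] - mxj) * (p[1] - myj)
                      for r, p in zip(resp, points)) / nj
            weights[j] = nj / n
            means[j] = (mxj, myj)
            covs[j] = ((sxx, sxy), (sxy, syy))
    return weights, means, covs


def mixture_density(x, weights, means, covs):
    """Value of the fitted mixture density at the point x = (F1, F2)."""
    return sum(w * gaussian_pdf(x, m, c)
               for w, m, c in zip(weights, means, covs))


# Synthetic (F1, F2) samples in Hz around three hypothetical vowel targets.
data_rng = random.Random(1)
targets = [(300.0, 2300.0), (700.0, 1200.0), (350.0, 800.0)]
points = [(data_rng.gauss(f1, 40.0), data_rng.gauss(f2, 120.0))
          for f1, f2 in targets for _ in range(100)]

weights, means, covs = fit_gmm(points, k=3)
for w, m in zip(weights, means):
    print(f"weight={w:.2f}  F1={m[0]:.0f} Hz  F2={m[1]:.0f} Hz")
```

Evaluating `mixture_density` over a grid of F1/F2 values would give the kind of probability surface that can be plotted as a vowel space. EM is sensitive to initialisation, so the fitted means may not coincide with the vowel targets on every run.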