Abstract
Some models of speech perception/production and language acquisition make use of a quasi-continuous representation of the acoustic speech signal. We investigate whether such models could profit from incorporating articulatory information in an analogous fashion. In particular, we examine how articulatory information, represented by electromagnetic articulography (EMA) measurements, can influence unsupervised phonetic speech categorization. Combining the acoustic signal with non-synthetic, raw articulatory data, we present first results of a clustering procedure of the kind applied in numerous language acquisition and speech perception models. We observe that unlabeled articulatory data, i.e. data without previously assumed landmarks, yield good clustering results. A more effective clustering outcome for plosives than for vowels appears to support the motor view of speech perception.

Index Terms: speech production/perception, modeling, clustering, EMA
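To make the described procedure concrete, the following is a minimal sketch of unsupervised clustering over combined acoustic and articulatory features. It is not the authors' implementation: the feature choices (MFCCs, EMA coil coordinates), their dimensions, the number of clusters, and the use of k-means are all assumptions introduced here for illustration.

# Illustrative sketch only: joint clustering of time-aligned acoustic
# and articulatory (EMA) feature frames into phonetic categories.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

def cluster_speech_frames(acoustic, ema, n_clusters=40, seed=0):
    """Cluster frames described by acoustic features (e.g. MFCCs) and
    EMA articulator positions, both shaped (n_frames, n_dims).

    Returns one cluster label per frame; labels serve as unsupervised
    phonetic categories.
    """
    # Concatenate the two streams frame by frame, then z-score each
    # dimension so neither modality dominates the distance metric.
    features = StandardScaler().fit_transform(np.hstack([acoustic, ema]))
    km = KMeans(n_clusters=n_clusters, random_state=seed, n_init=10)
    return km.fit_predict(features)

# Toy usage with random arrays standing in for real recordings:
rng = np.random.default_rng(0)
labels = cluster_speech_frames(rng.normal(size=(1000, 13)),   # 13 MFCCs
                               rng.normal(size=(1000, 12)))   # 6 EMA coils, x/y
print(np.bincount(labels))

Evaluating such clusters against phone annotations (e.g. cluster purity per phone class) would then show whether plosives separate more cleanly than vowels, as the abstract reports.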