Abstract
This study applies a semi-supervised graph-based dimensionality reduction algorithm (Laplacian Eigenmaps [Belkin & Niyogi, 2002]) to analyze burst spectra from adult productions of English /k/ and /t/. Multitaper spectra calculated over 25-ms windows were passed through a gammatone filter bank, which models the auditory periphery’s frequency selectivity and frequency-scale compression. From these psychoacoustic spectra, a graph was constructed: each node is a spectrum, two nodes were connected if their spectra shared a common talker or target word, and each connecting edge was weighted by the symmetric Kullback-Leibler divergence between the two spectra. This graph’s eigenvectors map the spectra into a low-dimensional feature space. Our preliminary experiments with 512 tokens produced by 16 talkers suggest that this algorithm can learn a two-dimensional representation of the bursts that reflects well-established articulatory constriction features. The first dimension linearly separated /k/ from /t/ before back vowels, reflecting posterior versus anterior constriction place; the second dimension linearly separated /k/ from /t/ before front vowels, reflecting dorsal versus apical lingual articulator. Experiments are underway to test how well the algorithm generalizes from the training set to unseen productions, both from the same talkers and from 5 novel talkers.
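The graph construction and embedding described above can be sketched roughly as follows. This is a minimal illustration, not the study's implementation. The function names (`symmetric_kl`, `laplacian_eigenmap`) are hypothetical, and so are the `scale` parameter and the heat-kernel step that turns a divergence into an edge weight: the abstract does not say how divergences were converted into weights, so that step is an assumption. The sketch also assumes each spectrum is a nonnegative vector of filter-bank channel energies that can be normalized to sum to one.

```python
import numpy as np
from scipy.linalg import eigh


def symmetric_kl(p, q, eps=1e-12):
    """Symmetric Kullback-Leibler divergence, KL(p||q) + KL(q||p).

    Both spectra are normalized to sum to one; eps avoids log(0).
    """
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum((p - q) * np.log(p / q)))


def laplacian_eigenmap(spectra, talkers, words, n_dims=2, scale=1.0):
    """Embed spectra with Laplacian Eigenmaps on a talker/word graph.

    Two tokens are connected if they share a talker or a target word.
    ASSUMPTION: the edge weight is exp(-divergence / scale), so that
    similar spectra receive large weights.
    """
    spectra = np.asarray(spectra, dtype=float)
    n = spectra.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            if talkers[i] == talkers[j] or words[i] == words[j]:
                d = symmetric_kl(spectra[i], spectra[j])
                W[i, j] = W[j, i] = np.exp(-d / scale)

    degrees = W.sum(axis=1)
    if np.any(degrees == 0):
        raise ValueError("every token needs at least one neighbour")

    D = np.diag(degrees)
    L = D - W  # unnormalized graph Laplacian

    # Generalized eigenproblem L y = lambda D y, eigenvalues ascending.
    # For a connected graph the first eigenvector is constant, so skip it.
    _, eigvecs = eigh(L, D)
    return eigvecs[:, 1:n_dims + 1]
```

Calling `laplacian_eigenmap(spectra, talkers, words)` with an array of shape `(n_tokens, n_channels)` and per-token talker and word labels returns an `(n_tokens, 2)` array of coordinates. This formulation embeds only the tokens in the graph; mapping unseen productions into the same space would need an out-of-sample extension, which the sketch does not cover.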