Abstract

Twelve Dutch vowels, each pronounced by 50 male speakers, were analyzed in 18 filter bands comparable in bandwidth to the ear's critical band. With the sound levels (in decibels) in these filter bands treated as dimensions, a principal-component analysis reduced the 18 dimensions per sound to four factors, which together explain 75% of the total variance. The configuration of the average vowels in the factor space proved to be highly correlated with their configuration in the F1−F2 formant plane. After matching to maximal congruence, the correlation coefficients along corresponding axes were 0.997 and 0.979. Machine vowel identification, based on the position of the individual vowels in the four-dimensional factor space, gave 98% correct identifications (after three pairs of related vowels were grouped together) when a correction was applied for the personal timbre of the speakers' voices. Ten listeners, to whom the 600 vowels were presented as 100-msec segments, gave 86% correct responses in identifying the intended vowels. The confusions between the vowel types served as the basis for a multidimensional scaling (Kruskal) that yielded a perceptual configuration of the vowels. In four dimensions the solution showed 2.3% stress. The perceptual configuration and the factor configuration, maximally matched, had correlation coefficients along corresponding axes of 0.997, 0.995, 0.907, and 0.794, respectively.
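The dimension-reduction step described above can be illustrated with a minimal sketch. It assumes a plain principal-component analysis computed by singular value decomposition with NumPy; the abstract does not specify the exact procedure used in the study. The function name `pca_reduce` and all data below are hypothetical: the 600 x 18 matrix is synthetic placeholder data, not the measured band levels, so the explained-variance figure it produces says nothing about the reported 75%.

```python
import numpy as np


def pca_reduce(levels_db, n_factors=4):
    """Project each row (one sound) onto the first n_factors principal axes.

    Returns the factor scores and the fraction of total variance they explain.
    """
    # Center each filter band on its mean level across all sounds.
    centered = levels_db - levels_db.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal axes,
    # ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    variance = singular_values ** 2
    explained = variance[:n_factors].sum() / variance.sum()
    scores = centered @ vt[:n_factors].T
    return scores, explained


# Synthetic placeholder data (assumption): 600 sounds x 18 band levels in dB,
# generated from four latent factors plus noise.
rng = np.random.default_rng(0)
n_sounds, n_bands = 600, 18
latent = rng.normal(size=(n_sounds, 4))
mixing = rng.normal(size=(4, n_bands))
noise = rng.normal(scale=3.0, size=(n_sounds, n_bands))
levels_db = 60.0 + 5.0 * (latent @ mixing) + noise

scores, explained = pca_reduce(levels_db)
print(scores.shape, round(float(explained), 3))
```

Each row of `scores` gives the position of one sound in the four-dimensional factor space; averaging the rows that belong to one vowel type would give the kind of average-vowel configuration that the abstract compares with the formant plane.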
