Abstract
A biologically plausible neural network model that employs unsupervised learning was applied to various sets of consonant-vowel (CV) syllables. This network has been shown to develop recognition of input signals on the basis of distinctive signal features rather than overall signal shape [N. Intrator and B. Seebach, Int. Neural Network Soc. Abstr. 1, Suppl. 1, 299 (1988)]. Syllables pronounced in isolation by male and female speakers were digitized and sampled in short (8–32 ms) overlapping time windows, then filtered into overlapping critical bandwidths [E. Zwicker, J. Acoust. Soc. Am. 33, 248 (1961)] to produce three-dimensional energy surfaces in time and frequency. A portion of these syllabic tokens was used as a training set for the network. The remaining tokens were used to test generalization of network solutions both within a single speaker's utterances and across speakers. For example, when trained on a single speaker's tokens, and tested for classification of place of articulation in stop consonants, the network might correctly identify approximately 80% of similar tokens from a different speaker.
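The preprocessing described above (short overlapping time windows followed by critical-band filtering) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the 16 ms window, the 50% overlap, the Hann taper, the 8 kHz sampling rate, and the use of 24 non-overlapping bands are all assumptions made for illustration; the abstract specifies overlapping critical bandwidths, which this simplified version does not reproduce. The Hz-to-Bark mapping uses the Zwicker and Terhardt (1980) approximation rather than anything stated in the abstract.

```python
import cmath
import math


def hz_to_bark(freq_hz):
    # Zwicker & Terhardt (1980) approximation of the critical-band rate.
    # Assumption: the abstract does not say which mapping was used.
    return 13.0 * math.atan(0.00076 * freq_hz) + 3.5 * math.atan((freq_hz / 7500.0) ** 2)


def split_frames(signal, frame_len, hop):
    # Overlapping frames; a trailing partial frame is dropped.
    return [signal[i:i + frame_len] for i in range(0, len(signal) - frame_len + 1, hop)]


def band_energies(frame, sample_rate, n_bands=24):
    n = len(frame)
    # Hann taper to reduce spectral leakage (an illustrative choice).
    tapered = [x * (0.5 - 0.5 * math.cos(2 * math.pi * k / (n - 1)))
               for k, x in enumerate(frame)]
    energies = [0.0] * n_bands
    for b in range(1, n // 2 + 1):  # skip the DC bin
        # Direct DFT of one bin; slow but dependency-free.
        coeff = sum(x * cmath.exp(-2j * math.pi * b * k / n)
                    for k, x in enumerate(tapered))
        # Simplification: each bin goes to exactly one Bark-wide band.
        band = min(int(hz_to_bark(b * sample_rate / n)), n_bands - 1)
        energies[band] += abs(coeff) ** 2
    return energies


def energy_surface(signal, sample_rate, window_ms=16.0, overlap=0.5):
    # Rows are time frames, columns are frequency bands.
    frame_len = int(sample_rate * window_ms / 1000)
    hop = max(1, int(frame_len * (1 - overlap)))
    return [band_energies(f, sample_rate)
            for f in split_frames(signal, frame_len, hop)]


# Hypothetical input: 0.1 s of a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(800)]
surface = energy_surface(tone, sample_rate)
peak_band = max(range(len(surface[0])), key=lambda i: surface[0][i])
print(len(surface), len(surface[0]), peak_band)
```

Each row of `surface` is one time window and each column one frequency band, so the nested list plays the role of the time-frequency energy surface that would be presented to the network. A real pipeline would replace the direct DFT with an FFT and use overlapping band filters.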