Abstract

Relations between the perception of vowels and four acoustic parameters, namely fundamental frequency (F0) and the center frequencies of the first three formants (F1, F2, and F3), are described. Predictions of listeners’ vowel identifications are best when acoustic trajectories are based on all four parameters. These parameters can be taken separately to form a four-dimensional space, or they can be combined to form a three-dimensional space such as Miller’s Auditory Perceptual Space (APS). The time-normalized paths through such spaces correlate best with listener responses. Within limits, temporal factors such as durations and speeds along these paths are not critical. However, the direction of movement along the path can be crucial. While movement in the forward direction usually evokes the perception of the intended vowel, movement in the opposite direction may sometimes evoke the perception of another vowel. Recent work shows that neural networks trained with inputs based on F0, F1, F2, and F3 perform very similarly to humans listening to the waveforms of the isolated nuclei. These results will be reviewed, and their implications for models of vowel perception will be discussed. [Work supported by NIDCD, AFOSR, and CID.]
