Abstract

Previous experiments examining the effects of frequency shifts on vowel perception show that identification accuracy drops when the spectrum envelope is shifted up by more than about 150%, or shifted down by factors smaller than about 60%, relative to adult male ranges. Such shifts produce formant patterns near the extreme limits found in human voices. These effects, however, interact with fundamental frequency (F0): in some conditions identification accuracy is higher when the formant frequencies (FFs) and F0 are shifted in the same direction than when one is raised and the other is lowered. The results indicate the presence of perceptual mechanisms that are sensitive to the natural covariation of F0 and FFs in human voices. Initial modeling shows that a model including both F0 and FFs predicts listeners’ behavior better than one based on FFs alone. Specifically, posterior probabilities from linear discriminant function analysis are better correlated with listeners’ identification rates when F0 is included than when it is not. Further modeling suggests that prediction of the overall perceptual results generally improves (especially in mismatched conditions) for modified models that include a positive correlation between F0 and FFs that is somewhat weaker than that observed in natural speech databases. [Work supported by NSF and SSHRC.]
