Abstract

A controlled, multi-feature acoustic analysis of intervocalic English plosives was conducted to evaluate acoustic stop features for automatic stop recognition. Both time-domain and frequency-domain features were measured interactively by computer. The resulting discrete density functions allowed a comparative evaluation of the contribution of individual (and in some instances conditional or combined) features to stop identification. The evaluation criterion was the maximum a posteriori probability of correctly identifying either the voicing mode (voiced or voiceless) or the place of articulation (labial, alveolar, or palato-velar) of a stop, independent of the talker and of a limited phonetic context. The features examined included variations on ten acoustic features for voicing-mode identification and seven for place-of-articulation identification. The database for this study consisted of five male and five female talkers producing all combinations of the stops C ∈ /p,t,k,b,d,g/ and vowels V ∈ /i,e,a,u/ in the sentence frame "please say /h 'CVt/ again." While the results indicate that burst/formant-onset frequency and voice onset time (VOT) are the two most salient individual features for stop identification, several additional features are also shown to provide important redundancy for identification. The often overlooked time-domain features of "double-burst release" and "voicing during stop closure" were found to be particularly useful in providing such redundancy.
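
The evaluation criterion described above can be illustrated with a minimal sketch. It assumes that each feature is quantized into bins, that the discrete density functions are class-conditional histograms, and that the probability of a correct identification is obtained by summing, over bins, the largest prior-weighted class likelihood. The function name, the three-bin VOT histogram, and the equal priors are hypothetical values chosen for illustration only; they are not data or procedures from the study.

```python
def map_correct_probability(priors, likelihoods):
    """Probability of a correct decision under the MAP rule.

    priors:      dict mapping class -> P(class)
    likelihoods: dict mapping class -> list of P(bin | class), one entry per bin
    """
    n_bins = len(next(iter(likelihoods.values())))
    total = 0.0
    for b in range(n_bins):
        # The MAP rule picks the class with the largest joint probability in
        # this bin; that joint probability is the mass classified correctly.
        total += max(priors[c] * likelihoods[c][b] for c in priors)
    return total


# Hypothetical, illustrative numbers only (not measurements from the study):
# VOT quantized into three bins (short, medium, long).
priors = {"voiced": 0.5, "voiceless": 0.5}
likelihoods = {
    "voiced": [0.80, 0.15, 0.05],
    "voiceless": [0.05, 0.25, 0.70],
}

p_correct = map_correct_probability(priors, likelihoods)
print(f"MAP probability of correct voicing identification: {p_correct:.3f}")
```

Under these assumptions, features can be ranked by comparing their values of `p_correct`, and a combined feature could be treated the same way by binning the joint observation.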
