Abstract
A comprehensive investigation of two acoustic feature sets for English stop consonants spoken in syllable initial position was conducted to determine the relative invariance of the features that cue place and voicing. The features evaluated were overall spectral shape, encoded as the cosine transform coefficients of the nonlinearly scaled amplitude spectrum, and formants. In addition, features were computed both for the static case, i.e., from one 25-ms frame starting at the burst, and for the dynamic case, i.e., as parameter trajectories over several frames of speech data. All features were evaluated with speaker-independent automatic classification experiments using the data from 15 speakers to train the classifier and the data from 15 different speakers for testing. The primary conclusions from these experiments, as measured via automatic recognition rates, are as follows: (1) spectral shape features are superior to both formants, and formants plus amplitudes; (2) features extracted from the dynamic spectrum are superior to features extracted from the static spectrum; and (3) features extracted from the speech signal beginning with the burst onset are superior to features extracted from the speech signal beginning with the vowel transition. Dynamic features extracted from the smoothed spectra over a 60-ms interval timed to begin with the burst onset appear to account for the primary vowel context effects. Automatic recognition results for the 6 stops (93.7%) based on 20 features was better than the rates obtained with human listeners for a 50-ms segment (89.9%) and only slightly worse than the rates obtained by human listeners for a 100-ms interval (96.6%). Thus the basic conclusion from our work is that dynamic spectral shape features are acoustically invariant cues for both place and voicing in initial stop consonants.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.