Abstract

This work studies the properties of the sound spectrum at the release of Italian stop consonants in vocalic contexts. The aim is to check whether the amplitudes of the spectral peaks can be used as acoustic attributes of the consonants' place of articulation. This information is useful for defining an automatic algorithm that can discriminate among different places of articulation using simple data such as the values, in dB, of the maximum peaks in different frequency ranges. Moreover, different measurements were performed (spectra computed at the release, averaged over 10 msec after the release, and smoothed) in order to determine which measure retains the most information about peak amplitudes.

Supported by IIASS, CNR, and INFM Salerno University. Acknowledgements go to M. Gabriella Di Benedetto for her useful comments and suggestions.

Materials and procedures

The recordings and measurements were made at the Research Laboratory of Electronics, Speech Communication Group, MIT, Cambridge, USA. The materials consisted of VCVC utterances produced by seven adult Italian speakers (three females and four males) in a sound-treated room and recorded on a high-quality magnetic tape recording system. The utterances were embedded in a carrier phrase. The measurements were made for the intervocalic consonant. Data were collected for all Italian vowels embedded in stop contexts. However, the results reported in the present paper are derived from the analysis of the stop consonants in the [a] context. The spectral representations used include a DFT spectrum, a smoothed DFT, and a spectral averaging. The analysis window (Hamming window) was set to 3.1 msec. The spectrum at the consonant release, the averaged spectrum over the first 4 msec (for [b, d, g]) and over 10 msec (for [p, t, k]) after the release, and the k-averaged² spectrum were computed using a software program developed by Klatt (1984).
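The release spectrum and the averaged spectrum can be approximated with a short NumPy sketch. This is not the Klatt (1984) program. The function names, the 0.97 preemphasis coefficient, the 512-point DFT, the 1 msec hop between frames, and the choice to average power spectra are all assumptions made for illustration. Only the 3.1 msec Hamming window and the 4 msec and 10 msec averaging spans come from the text.

```python
import numpy as np

def preemphasize(x, coeff=0.97):
    """First-order preemphasis y[n] = x[n] - coeff * x[n-1] (coeff is an assumed value)."""
    x = np.asarray(x, dtype=float)
    return np.append(x[0], x[1:] - coeff * x[:-1])

def frame_power(x, fs, start, win_ms=3.1, nfft=512):
    """Power spectrum of one Hamming-windowed frame beginning at sample `start`."""
    n = int(round(win_ms * 1e-3 * fs))          # window length in samples
    frame = np.asarray(x[start:start + n], dtype=float)
    if len(frame) < n:
        raise ValueError("analysis window runs past the end of the signal")
    spectrum = np.fft.rfft(frame * np.hamming(n), nfft)  # zero-padded DFT
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    return freqs, np.abs(spectrum) ** 2

def to_db(power):
    """Convert power to dB; the small offset avoids log(0)."""
    return 10.0 * np.log10(power + 1e-12)

def release_spectrum_db(x, fs, release, win_ms=3.1, nfft=512):
    """DFT spectrum (dB) of the single frame starting at the release sample."""
    freqs, power = frame_power(x, fs, release, win_ms, nfft)
    return freqs, to_db(power)

def averaged_spectrum_db(x, fs, release, span_ms, win_ms=3.1, hop_ms=1.0, nfft=512):
    """Mean power spectrum (dB) of frames starting within `span_ms` after the release.

    Averaging power spectra at a fixed hop is an assumption; the source does
    not say how its analysis program forms the average.
    """
    hop = max(1, int(round(hop_ms * 1e-3 * fs)))
    n_frames = max(1, int(round(span_ms / hop_ms)))
    powers = []
    for k in range(n_frames):
        freqs, power = frame_power(x, fs, release + k * hop, win_ms, nfft)
        powers.append(power)
    return freqs, to_db(np.mean(powers, axis=0))
```

Under these assumptions, `averaged_spectrum_db(signal, fs, release, span_ms=4)` would correspond to the voiced stops [b, d, g] and `span_ms=10` to the voiceless stops [p, t, k], where `signal`, `fs`, and `release` are a hypothetical waveform, its sampling rate, and the release sample index.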
All spectra were preemphasized, and the spectral amplitudes were enhanced by modifying an overall spectral gain control parameter. The amplitudes of the maximum peaks in different frequency ranges were measured by visual examination.

The amplitude attributes

The peak amplitudes measured in the different frequency ranges described above were compared in order to identify properties that could help discriminate the place of articulation of each consonant. Initially, averages of the maximum peak amplitudes in different frequency ranges were computed. However, even though some of these averages differ significantly from one consonant to another, the standard deviations were high and the distributions overlapped. This effect is mostly due to the variability of the peak amplitudes among the speakers. For this reason we excluded these measures and started to look at the amplitudes of the maximum peaks in specified frequency ranges compared to the amplitudes of the maximum peaks in other

² The k-averaged spectrum was computed by measuring the VOT length of the voiceless consonant. The cursor was then placed on the waveform at the temporal sampling point corresponding to half the VOT length, and the spectrum was averaged over 5 msec to the left and 5 msec to the right of this sam-
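The comparison of maximum peak amplitudes between frequency ranges, and the placement of the k-averaged window, can be sketched as follows. The source measured peaks by visual examination, so taking the largest value in a band is only an automated stand-in. The function names and any band limits passed in are hypothetical, and the sketch assumes that half the VOT length is counted from the release.

```python
import numpy as np

def max_peak_db(freqs, spec_db, f_lo, f_hi):
    """Largest spectral amplitude (dB) among samples with f_lo <= f < f_hi.

    This is the band maximum, used here as a stand-in for the highest
    peak that a human reader would pick out of the plotted spectrum.
    """
    freqs = np.asarray(freqs, dtype=float)
    spec_db = np.asarray(spec_db, dtype=float)
    band = (freqs >= f_lo) & (freqs < f_hi)
    if not band.any():
        raise ValueError("no spectral samples in the requested band")
    return float(spec_db[band].max())

def relative_peak_db(freqs, spec_db, band_a, band_b):
    """Difference (dB) between the maximum peaks of two frequency ranges.

    A within-token difference removes any gain offset common to the
    whole spectrum of that token.
    """
    return max_peak_db(freqs, spec_db, *band_a) - max_peak_db(freqs, spec_db, *band_b)

def k_average_window(release_s, vot_s, half_width_s=0.005):
    """Start and end times (s) of the window centered at half the VOT.

    Assumes the VOT interval begins at the release; the 5 msec half
    width on each side follows the footnote.
    """
    center = release_s + vot_s / 2.0
    return center - half_width_s, center + half_width_s
```

Because `relative_peak_db` subtracts two amplitudes taken from the same spectrum, an overall level difference between tokens cancels out, which is one reason a relative measure might vary less across speakers than the raw peak amplitudes.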
