Abstract

A three-layered neural network approach for burst point location is presented, which can be used to extract consonant segments in a speaker-independent continuous speech recognition system. By using neural networks trained with the backpropagation algorithm [Rumelhart et al., Nature 323, 533–536 (1986)], nonlinearity is introduced into the decision making of articulatory event detection. The system can detect the burst point location in the French voiced stop consonants /b,d,g/. For the experiments, a neural network structure of 12–20 units in the hidden layer, 50 units in the input layer, and 1 unit in the output layer is used. The input patterns represent the time series values of the speech power transition. The network was trained in several steps, initially using a smaller set of training data and then using larger sets. The results of the burst location detection were encouraging, especially for the syllables “ba,” “dou,” and “ga.” Generally, the detection rate for /b/ was about twice as high as that for /d/ and /g/. There was no remarkable difference between the detection rates on the training data and on the unknown data.
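The architecture the abstract describes (50 input units carrying speech-power time-series values, 12–20 hidden units, 1 output unit, trained by backpropagation) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the synthetic "power transition" patterns, the learning rate, and the training schedule are all assumptions made for the example.

```python
import math
import random

random.seed(0)
N_IN, N_HID = 50, 12  # 50 input units; the abstract reports 12-20 hidden units

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Small random initial weights; b1/b2 are the hidden- and output-layer biases.
W1 = [[random.uniform(-0.1, 0.1) for _ in range(N_HID)] for _ in range(N_IN)]
b1 = [0.0] * N_HID
W2 = [random.uniform(-0.1, 0.1) for _ in range(N_HID)]
b2 = 0.0

def forward(x):
    """Forward pass: 50 inputs -> sigmoid hidden layer -> one burst score."""
    h = [sigmoid(sum(x[i] * W1[i][j] for i in range(N_IN)) + b1[j])
         for j in range(N_HID)]
    y = sigmoid(sum(h[j] * W2[j] for j in range(N_HID)) + b2)
    return h, y

def train_step(x, t, lr=0.5):
    """One backpropagation update on a single pattern x with target t."""
    global b2
    h, y = forward(x)
    d_out = (y - t) * y * (1.0 - y)                      # output delta (squared error)
    d_hid = [d_out * W2[j] * h[j] * (1.0 - h[j]) for j in range(N_HID)]
    for j in range(N_HID):
        W2[j] -= lr * d_out * h[j]
        b1[j] -= lr * d_hid[j]
        for i in range(N_IN):
            W1[i][j] -= lr * d_hid[j] * x[i]
    b2 -= lr * d_out

def make_pattern(has_burst):
    """Synthetic stand-in for a power-transition frame: a burst is modeled
    here as an abrupt mid-frame rise in power (an assumption for this demo)."""
    x = [random.gauss(0.0, 0.05) for _ in range(N_IN)]
    if has_burst:
        for i in range(N_IN // 2, N_IN):
            x[i] += 1.0
    return x

# Toy training set and a few passes of per-pattern backpropagation.
data = [(make_pattern(k == 1), float(k)) for k in (0, 1) for _ in range(30)]
for _ in range(100):
    for x, t in data:
        train_step(x, t)
```

After training on the toy set, `forward(make_pattern(True))[1]` should score well above 0.5 and `forward(make_pattern(False))[1]` well below it, mirroring the network's role as a burst/no-burst detector over a windowed power contour.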

