Abstract
Speech timing problems associated with dysarthria often involve periods of extraneous silence and nonspeech sounds as well as inappropriately timed or misplaced speech gestures. This study evaluated the performance of neural networks in detecting inappropriate or nonspeech sounds and extraneous silence. The “opt” neural network program [E. Barnard and R. Cole, OGC Tech. Rep. No. CSE 89‐014], which uses a conjugate gradient algorithm to adjust node weights, was trained to recognize breaths and silence in a reading of the Rainbow Passage by a single dysarthric talker with cerebral palsy. Input to the network consisted of a sequence of frames of parameters derived from spectral analysis of the speech. The output was a binary (speech/nonspeech) decision for the segment of signal corresponding to the middle frame of the input sequence. Networks of various sizes and configurations were trained on half of the available data and tested on the remaining half. The best network configurations correctly identified approximately 99% of the frames in the training set and about 97% of the frames in test datasets.
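The input and output arrangement described above can be illustrated with a minimal Python sketch. It is not the "opt" program and makes several assumptions the abstract does not state: the names `context_windows`, `split_half`, and `frame_accuracy` are hypothetical, the window half-width is a free parameter, the spectral parameters are taken as already computed, and the half/half split is assumed to be contiguous.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[float]  # one frame of spectral parameters (contents assumed)


def context_windows(
    frames: Sequence[Frame],
    labels: Sequence[int],
    half_width: int,
) -> List[Tuple[List[float], int]]:
    """Pair each run of 2*half_width + 1 frames with the label of its middle frame.

    Labels are 1 for speech and 0 for nonspeech (breath or silence).
    Frames too close to either end to have a full window are skipped.
    """
    if len(frames) != len(labels):
        raise ValueError("frames and labels must have the same length")
    examples = []
    for mid in range(half_width, len(frames) - half_width):
        window = frames[mid - half_width : mid + half_width + 1]
        # Flatten the window into one input vector for the network.
        flat = [value for frame in window for value in frame]
        examples.append((flat, labels[mid]))
    return examples


def split_half(examples: list) -> Tuple[list, list]:
    """Split examples into a first half for training and the rest for testing."""
    cut = len(examples) // 2
    return examples[:cut], examples[cut:]


def frame_accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Fraction of frames whose predicted label matches the reference label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot score an empty sequence")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)
```

Any classifier that maps the flattened window to a binary label could be trained on the first half and scored with `frame_accuracy` on the second; the reported 99% and 97% figures are frame-level accuracies of this kind.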