Abstract
The authors describe a fast training algorithm for feedforward neural nets and apply it to a two-layer neural network that classifies segments of speech as voiced, unvoiced, or silence. The classification method is based on features computed for each speech segment and used as input to the network. The network weights are trained with a fast algorithm based on a quasi-Newton error minimization method that maintains a positive-definite approximation of the Hessian matrix. When used for voiced-unvoiced-silence classification of speech frames, the performance of the network compares favorably with that of current approaches. Experimental results are presented for speaker-dependent speech classification, including an evaluation of the effects of the type of input data used during training. The results indicate satisfactory performance, with errors in the range of 3-5% relative to manual classification of the speech frames.
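The abstract does not give implementation details of the training procedure, so the following is a minimal sketch under stated assumptions: it uses BFGS, a standard quasi-Newton method that maintains a positive-definite approximation of the (inverse) Hessian, to train a small two-layer network on synthetic stand-in feature vectors. The feature dimension, hidden-layer size, and data are illustrative placeholders, not the authors' configuration.

```python
# Hedged sketch, not the paper's implementation: BFGS (a quasi-Newton method
# with a positive-definite Hessian approximation) training a two-layer net
# for 3-way voiced / unvoiced / silence classification. All sizes and data
# below are illustrative assumptions.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

n_features = 5   # assumed per-frame feature count (e.g. energy, zero crossings)
n_hidden = 8     # assumed hidden-layer size
n_classes = 3    # voiced, unvoiced, silence

# Synthetic stand-in for labeled speech-frame features (not real data).
X = rng.normal(size=(300, n_features))
y = rng.integers(0, n_classes, size=300)

def unpack(theta):
    """Split the flat parameter vector into the two weight/bias layers."""
    i = 0
    W1 = theta[i:i + n_features * n_hidden].reshape(n_features, n_hidden); i += W1.size
    b1 = theta[i:i + n_hidden]; i += n_hidden
    W2 = theta[i:i + n_hidden * n_classes].reshape(n_hidden, n_classes); i += W2.size
    b2 = theta[i:i + n_classes]
    return W1, b1, W2, b2

def loss(theta):
    """Mean cross-entropy of a tanh hidden layer followed by a softmax output."""
    W1, b1, W2, b2 = unpack(theta)
    h = np.tanh(X @ W1 + b1)
    logits = h @ W2 + b2
    logits -= logits.max(axis=1, keepdims=True)            # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(y)), y].mean()

n_params = n_features * n_hidden + n_hidden + n_hidden * n_classes + n_classes
theta0 = rng.normal(scale=0.1, size=n_params)

# BFGS builds its positive-definite curvature estimate from successive gradient
# differences; gradients here are estimated numerically for brevity.
result = minimize(loss, theta0, method="BFGS", options={"maxiter": 200})
print("final training loss:", result.fun)
```

In practice an analytic gradient (backpropagation) would replace the numerical one used here, since quasi-Newton methods still require the gradient at each iteration and numerical differentiation scales poorly with the number of weights.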