Abstract

An isolated phoneme recognition system based on time-frequency domain analysis and support vector machines (SVMs) is proposed. Experiments used the TIMIT corpus, which contains a total of 6300 sentences: ten sentences spoken by each of 630 speakers from eight major dialect regions of the United States. Phonemes were extracted from the speech samples using the time-aligned phonetic transcriptions provided with the corpus. A 55-output classifier system, with one output per phoneme class, was designed and trained with kernel learning algorithms. The training dataset was extracted from clean training samples. A portion of the database, 65,338 samples of the training dataset, was used to train the system. The performance of the system on the training dataset was 76.4%. The whole test dataset of the TIMIT corpus, 55,655 samples, was used to test the generalization of the system. The performance of the system on the test dataset was 45.3%. The approach is currently being extended to continuous phoneme recognition. [Work supported in part by grants from DARPA, NASA, and ONR.]
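The segmentation step described above, cutting isolated phonemes out of an utterance with a time-aligned transcription, can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the alignment is given as TIMIT-style .PHN text (one "start end label" triple per line, in sample indices), and the names parse_phn and extract_phonemes, along with the toy input, are hypothetical. Feature extraction and SVM training are not shown because the abstract does not specify them.

```python
from typing import List, Sequence, Tuple

# (start_sample, end_sample, phone_label); end is treated as exclusive here.
Alignment = Tuple[int, int, str]


def parse_phn(text: str) -> List[Alignment]:
    """Parse TIMIT-style .PHN text: one 'start end label' triple per line."""
    alignment = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        start, end, label = int(parts[0]), int(parts[1]), parts[2]
        if end > start:  # drop empty or inverted intervals
            alignment.append((start, end, label))
    return alignment


def extract_phonemes(
    samples: Sequence[float], alignment: List[Alignment]
) -> List[Tuple[str, Sequence[float]]]:
    """Slice the waveform into one labeled segment per aligned phoneme."""
    segments = []
    for start, end, label in alignment:
        end = min(end, len(samples))  # clip intervals that overrun the audio
        if start < end:
            segments.append((label, samples[start:end]))
    return segments


# Hypothetical toy input: a 16-sample "waveform" and a three-phone alignment.
phn_text = "0 4 h#\n4 10 sh\n10 16 iy\n"
waveform = list(range(16))
segments = extract_phonemes(waveform, parse_phn(phn_text))
for label, segment in segments:
    print(label, len(segment))
```

Each (label, segment) pair would then be converted to a fixed-length time-frequency feature vector before being passed to the 55-class classifier.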
