Abstract
A significant portion of stop-consonant recognition errors is caused by confusion between voiced stop consonants and their unvoiced counterparts. Recognition is based on HMMs that use 12 MFCCs and energy, together with their time derivatives. The voicing state is the distinctive feature separating homorganic stop consonants. Analysis of the recognition error rate suggests that the cepstral features do not accurately represent the voicing state of the modeled phone. To address this, a voiced–unvoiced classifier used in conjunction with the HMMs is proposed to improve stop-consonant recognition. Recognition is performed in two passes. In the first pass, a phone recognizer uses well-trained HMMs to identify a stop consonant, providing both the recognized stop consonant and its log probability. In the second pass, the voiced–unvoiced classifier checks whether the voicing state of the phone segment matches its phonetic description. In the case of a mismatch combined with a low recognition probability, the voiced (unvoiced) consonant is swapped with its corresponding unvoiced (voiced) counterpart. Recognition results are presented in terms of error rate for different voiced–unvoiced classification techniques. This method reduces the recognition error rate of stop consonants. [Research supported by DARPA Contract Nos. DABT63-93-C-0037, N66001-96-C-8510, and NSF Contract No. IRI-9618854.]
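The second-pass correction rule described above can be summarized as a simple decision: swap the recognized stop for its homorganic counterpart only when the voicing classifier disagrees with the label and the first-pass HMM score is low. The sketch below is an illustrative reconstruction of that rule, not the authors' implementation; the pair table, the `classify`-style inputs, and the threshold `LOG_PROB_THRESHOLD` are assumptions introduced for clarity.

```python
# Minimal sketch of the second-pass voicing correction, assuming a boolean
# voiced-unvoiced classifier output and a log-probability confidence threshold.
# All names below are illustrative, not identifiers from the paper.

# Homorganic voiced/unvoiced stop pairs (ARPAbet-style labels).
VOICED_TO_UNVOICED = {"b": "p", "d": "t", "g": "k"}
UNVOICED_TO_VOICED = {v: k for k, v in VOICED_TO_UNVOICED.items()}

LOG_PROB_THRESHOLD = -50.0  # assumed cutoff on the first-pass HMM log probability


def correct_stop(phone: str, log_prob: float, segment_is_voiced: bool) -> str:
    """Swap a stop for its homorganic counterpart when the voiced-unvoiced
    classifier disagrees with the recognized label and the HMM score is low."""
    label_is_voiced = phone in VOICED_TO_UNVOICED
    voicing_mismatch = label_is_voiced != segment_is_voiced
    low_confidence = log_prob < LOG_PROB_THRESHOLD

    if voicing_mismatch and low_confidence:
        return VOICED_TO_UNVOICED[phone] if label_is_voiced else UNVOICED_TO_VOICED[phone]
    return phone


# Example: the HMM outputs "d" with a low score, but the segment is classified
# as unvoiced, so the label is corrected to "t".
print(correct_stop("d", log_prob=-72.3, segment_is_voiced=False))  # -> "t"
```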