Abstract

The performance of a phonotactic language recognition system depends on the quality of its phone recognizers. To improve the recognizers, this paper investigates new acoustic features and discriminative training techniques for them. The commonly used features are static cepstral coefficients appended with their first- and second-order deltas. This configuration may not be optimal for phone recognition in phonotactic language recognition systems. In this paper, a time-frequency cepstral (TFC) feature is proposed, based on our previous work on acoustic language recognition systems. The feature is extracted as follows: first, a temporal discrete cosine transform (DCT) is carried out on the cepstrum matrix; then the transformed elements in a specific area are selected using the variance maximization criterion. Different parameters are tested to obtain the optimal configuration. We also adopt the feature minimum phone error (fMPE) method for discriminative training of the phone models, to further improve phone recognition results. The effectiveness of the two techniques is demonstrated on the NIST Language Recognition Evaluation (LRE) 2007 database, including the 30-second, 10-second and 3-second closed-set test conditions.
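
The two extraction steps named above (a temporal DCT over the cepstrum matrix, then variance-based selection of transformed elements) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the unnormalized DCT-II, the block layout (a list of frames, each a list of cepstral coefficients), and the choice to keep the `n_keep` highest-variance positions rather than a contiguous area are all assumptions made for illustration.

```python
import math


def dct_ii(seq):
    """Unnormalized DCT-II of a 1-D sequence (assumed transform variant)."""
    n = len(seq)
    return [
        sum(x * math.cos(math.pi * (t + 0.5) * k / n) for t, x in enumerate(seq))
        for k in range(n)
    ]


def temporal_dct(block):
    """Apply the DCT along time for each cepstral coefficient.

    block: list of frames, each frame a list of cepstral coefficients.
    Returns a matrix indexed as [cepstral_index][temporal_dct_index].
    """
    n_ceps = len(block[0])
    return [dct_ii([frame[c] for frame in block]) for c in range(n_ceps)]


def select_by_variance(blocks, n_keep):
    """Rank (cepstral, temporal-DCT) positions by variance across blocks.

    Assumption: 'variance maximization' is read here as keeping the n_keep
    positions whose transformed values vary most over the training blocks.
    """
    transformed = [temporal_dct(b) for b in blocks]
    n_ceps = len(transformed[0])
    n_dct = len(transformed[0][0])
    scores = []
    for c in range(n_ceps):
        for k in range(n_dct):
            vals = [m[c][k] for m in transformed]
            mean = sum(vals) / len(vals)
            var = sum((v - mean) ** 2 for v in vals) / len(vals)
            scores.append((var, c, k))
    # Highest variance first; ties broken by position for determinism.
    scores.sort(key=lambda s: (-s[0], s[1], s[2]))
    return [(c, k) for _, c, k in scores[:n_keep]]


def tfc_features(block, positions):
    """Build one feature vector from the selected positions of one block."""
    m = temporal_dct(block)
    return [m[c][k] for c, k in positions]
```

In this sketch, `select_by_variance` would be run once on training blocks to fix the positions, and `tfc_features` would then be applied to each block of frames at recognition time. The block length and the number of retained elements correspond to the parameters the abstract says are tuned; no values are assumed here.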
