Abstract

This paper presents a comprehensive study of continuous speech recognition in Spanish. It shows the use and optimisation of several well-known techniques together with the application for the first time to Spanish of language specific knowledge to these systems, i.e. the careful selection of the phone inventory, the phone-classes used, and the selection of alternative pronunciation rules. We have developed a semicontinuous phone-class dependent contextual modelling. Using four phone-classes, we have obtained recognition error rate reductions roughly equivalent to the percentage increase of the number of parameters, compared to baseline semicontinuous contextual modelling. We also show that the use of pausing in the training system and multiple pronunciations in the vocabulary help to improve recognition rates significantly. The actual pausing of the training sentences and the application of assimilation effects improve the transcription into context-dependent units. Multiple pronunciation possibilities are generated using general rules that are easily applied to any Spanish vocabulary. With all these ideas we have reduced the recognition errors of the baseline system by more than 30% in a task parallel to DARPA-RM translated into Spanish with a vocabulary of 979 words. Our database contains four speakers with 600 training sentences and 100 testing sentences each. All experiments have been carried out with a perplexity of 979, and even slightly higher in the case of multiple pronunciations, to be able to study the acoustic modelling power of the systems with no grammar constraints.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.