Abstract

This work, divided into Parts I and II, describes the development of GorUP, a Semantic Speech Recognition System for the Basque context. Part I analyses cross-lingual approaches oriented to under-resourced languages, and Part II covers the development of the Language Identification system. Throughout the development, data optimization methods and Soft Computing methodologies oriented to complex environments are used to overcome the lack of resources. Three languages coexist in this context: French, Spanish and Basque. Our main goal is the development of robust Automatic Speech Recognition (ASR) systems for Basque, but the full range of language variability has to be analyzed. In particular, Basque speakers mix not only sounds but also words from the three languages in their speech, which results in a strong presence of cross-lingual elements. In addition, Basque is an agglutinative language with a distinctive morpho-syntactic structure inside words, which may lead to intractable vocabularies. Our work is currently oriented to Information Retrieval, mainly for small internet mass media. In these cases the resources available for Basque in general, and for this task in particular, are scarce and difficult to process because of the noisy environment. The methods employed in this development (an ontology-based approach and cross-lingual methodologies that draw on better-resourced languages) could therefore suit the requirements of many under-resourced languages.
