Abstract

The EtsaTrans machine translation system has been in development at the University of the Free State for the last four years and is currently the only machine translation system being developed in South Africa for specialised, non-general translation needs. The purpose of this exposition is to present the program through its phases of development and to report on its current levels of performance. We analyse the output and the size of the database, and then propose the future implementation of a part-of-speech tagger and a word stemmer in the program to improve its linguistic performance. Our goal with the system is not to translate all types of document, but to work in a specialised domain in which the documents to be translated are repetitive in nature. This will enable translators to spend more time on non-repetitive subject matter. By capturing the language of such repetitive documents in the database, we are able to create a standardised language usage for the specialised domain.

Highlights

  • The EtsaTrans machine translation system has been under development at the University of the Free State for the past four years

  • The University of the Free State took over the rights of the LEXICA system from the company EPI-USE Systems in 2000

  • An evaluation done on the system showed that continuing with the development of a purely rule-based machine translation (RBMT) system would be futile in terms of the latest developments within machine translation

Summary

Historical background

The University of the Free State took over the rights of the LEXICA system from the company EPI-USE Systems in 2000. Sumita and Iida (1999) state that conventional machine translation systems use rules as knowledge, and that it is difficult to build a practical system because of the problem of building such a large-scale rule base. An evaluation of the system showed that continuing with the development of a purely rule-based machine translation (RBMT) system would be futile in the light of the latest developments within machine translation (see Snyman & Naudé, 2003). “The RBMT [technique] is associated with systems that rely on different linguistic levels of rules for translation between the source and target language” (Dorr et al., 1998:25). Having done some example-based machine translation (EBMT) development, the team soon realised that linguistic knowledge and its applications are essential

EtsaTrans developmental aspects
EtsaTrans at work: translating in an administrative domain
Building a database
Testing EtsaTrans
Test 1
Test 2
Database size
Findings
Conclusions and summary