Anusaaraka: An expert system based machine translation system

Sriram Chaudhury, Dipti M Sharma, Ankitha Rao

doi:10.1109/nlpke.2010.5587789

Abstract

Most research in machine translation aims to have computers bear the entire load of translating one human language into another. This paper looks at the machine translation problem afresh and observes that there is a need to share the load between man and machine, distinguish reliable knowledge from heuristics, provide a spectrum of outputs to serve different strata of people, and make use of existing resources instead of reinventing the wheel. It describes a unique approach to developing a machine translation system based on the insights of information dynamics from the Paninian Grammar Formalism. Anusaaraka is a Language Accessor cum Machine Translation system based on the fundamental premise of sharing the load, producing good-enough results according to the needs of the reader. The system promises a faithful representation of the translated text, no loss of information during translation, and graceful degradation (robustness) in case of failure. The layered output provides access to all stages of translation, making the whole process transparent. Thus, Anusaaraka differs from other machine translation systems in two respects: (1) it is committed to faithfulness and therefore provides a layer of 100% faithful output, so that a user with some training can “access the source text” faithfully; (2) it is designed so that a user can contribute to it and participate in improving its quality. Further, Anusaaraka provides an eclectic combination of the Apertium architecture with a forward-chaining expert system, allowing use of both deep parser and shallow parser outputs to analyze the source language (SL) text. Existing language resources (parsers, taggers, chunkers) available under the GPL are used instead of being rewritten. Language data and linguistic rules are independent of the core programme, making it easy for linguists to modify them and experiment with different language phenomena to improve the system.
Users can become contributors by submitting new word sense disambiguation (WSD) rules for ambiguous words through a web interface available over the internet. The system uses the forward chaining of an expert system to infer new language facts from the existing language data. It helps to handle the complexity of language translation by applying specific knowledge rather than a specific technique, creating a vast language knowledge base in electronic form. In other words, the expert system facilitates the transformation of the knowledge of human subject matter experts (SMEs) into a computer-processable knowledge base.
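
As a rough illustration of how forward chaining can infer new language facts from existing ones, the following minimal Python sketch applies rules repeatedly until no new fact is produced. It is not taken from the Anusaaraka code base: the fact tuples, the rule format, the `forward_chain` function, and the example WSD rules for the word "bank" (including the transliterated target word) are all hypothetical and chosen only for illustration.

```python
# Minimal forward-chaining sketch (hypothetical; not the Anusaaraka implementation).
# A fact is a tuple of strings; a rule is a pair (premises, conclusion), where
# premises is a frozenset of facts that must all be known before the rule fires.

def forward_chain(facts, rules):
    """Return all facts derivable from `facts` by repeatedly applying `rules`."""
    known = set(facts)
    changed = True
    while changed:  # keep sweeping until a full pass adds nothing new
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known


# Illustrative WSD rules for an ambiguous source word (assumed data).
rules = [
    # Sense -> target-language word; fires only after a sense has been inferred.
    (frozenset({("sense", "bank", "riverside")}),
     ("target", "bank", "kinArA")),
    # Context -> sense.
    (frozenset({("word", "bank"), ("context", "river")}),
     ("sense", "bank", "riverside")),
    (frozenset({("word", "bank"), ("context", "money")}),
     ("sense", "bank", "financial_institution")),
]

# Initial facts, as might come from analysing a source sentence.
facts = {("word", "bank"), ("context", "river")}

derived = forward_chain(facts, rules)
print(sorted(derived - facts))
```

In this sketch a contributed WSD rule is just one more entry in `rules`, which mirrors the idea that language data and rules can be kept separate from the core programme. The first rule is listed before the rule that establishes its premise on purpose, to show that the conclusion is still reached through chaining regardless of rule order.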

Full Text