Acquisition of large lexicons for practical knowledge-based MT

Deryle Lonsdale, Eric Nyberg, Teruko Mitamura

doi:10.1007/bf00980580

Abstract

Although knowledge-based MT systems have the potential to achieve high translation accuracy, each successful application system requires a large amount of hand-coded lexical knowledge. Systems like KBMT-89 and its descendants have demonstrated how knowledge-based translation can produce good results in technical domains with tractable domain semantics. Nevertheless, the magnitude of the development task for large-scale applications with tens of thousands of domain concepts precludes a purely hand-crafted approach. The current challenge for the “next generation” of knowledge-based MT systems is to utilize on-line textual resources and corpus analysis software to automate the most laborious aspects of the knowledge acquisition process. This partial automation can in turn maximize the productivity of human knowledge engineers and help to make large-scale applications of knowledge-based MT a viable approach. In this paper we discuss the corpus-based knowledge acquisition methodology used in KANT, a knowledge-based translation system for multilingual document production. This methodology can be generalized beyond the KANT interlingua approach for use with any system that requires similar kinds of knowledge.

Full Text