The intermediary language for multilanguage translation

Michael Zarechnak

doi:10.1007/bf00936468

Abstract

This paper discusses some basic notions involved in designing, developing, and implementing the Intermediary Language (IL) for Machine Translation applied to a set of languages. The stages for the design of the IL would include the independent analysis and synthesis of each language in its own terms. Then each could be mapped once into the IL dictionary and grammar, creating the IL text. From the IL text the transfer routine would synthesize the target text for a particular language. It is assumed that the IL text would have an algebraic representation of the variables to be instantiated in the target language on the basis of the IL text information. The IL should contain all the information occurring in the set of languages plus such generalizations as might be justified on the basis of inductive implications and/or deductively oriented postulates to be verified by adding new languages for testing the capacity of the IL. Given five languages spoken by more than a hundred million people, so that $N = 5$, pairwise translation (say, into English) requires $N^2 - N = 20$ programs, whereas translation through the IL requires $2N + 1 = 11$ programs, yielding a significant gain. The IL metalanguage, ideally, should have the capacity to function as an algebraic representation of both paradigmatic units (the selection axis) and their relationships (the contiguity axis). Both should be correlated with the extralinguistic fragments in terms of determiners, quantifiers, and classifiers. The structure of the IL grammar contains four components: dictionary, context-free information providing the nonterminal dictionary (i.e., classification), parser/synthesizer, and the initial string.
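The program counts quoted in the abstract can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the two formulas are taken as stated in the abstract, which does not say what the extra "+1" program in the IL count is.

```python
def pairwise_programs(n: int) -> int:
    """Programs needed for direct translation between every
    ordered pair of n languages: n^2 - n."""
    return n * n - n


def il_programs(n: int) -> int:
    """Programs needed when translating through an intermediary
    language, using the count given in the abstract: 2n + 1."""
    return 2 * n + 1


if __name__ == "__main__":
    # Compare the two approaches for a few set sizes, including N = 5.
    for n in (2, 3, 4, 5, 10):
        direct = pairwise_programs(n)
        via_il = il_programs(n)
        print(f"N={n}: pairwise={direct}, IL={via_il}, saved={direct - via_il}")
```

Under these formulas the IL approach needs more programs than the pairwise approach for three or fewer languages and fewer for four or more. The pairwise count grows quadratically and the IL count linearly, so the gap widens as languages are added.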

Full Text