Alignment and extraction of bilingual legal terminology from context profiles

Oi Yee Kwong, Benjamin K Tsou, Tom B Y Lai

doi:10.1075/term.10.1.05kwo

Abstract

In this study, we propose a method for aligning terms and extracting translations from a small, domain-specific corpus of parallel English and Chinese court judgments from Hong Kong. Given a sentence-aligned corpus, translation equivalents are suggested by analysing the frequency profiles of parallel concordances. The method overcomes the limitations of conventional statistical methods, which require large corpora to be effective, and those of lexical approaches, which depend on existing bilingual dictionaries. Pilot testing on a parallel corpus of about 113K Chinese words and 120K English words gives an encouraging 79% precision and 38% recall on average. The method has its own limitations, such as failure to detect multiple candidates and secondary translations, but it provides a good basis for acquiring an initial translation lexicon for legal terminology from indigenous bilingual legal texts.
