Enhancing cross-language information retrieval by an automatic acquisition of bilingual terminology from comparable corpora

Fatiha Sadat, Masatoshi Yoshikawa, Shunsuke Uemura

doi:10.1145/860435.860519

Publication Date: Jul 28, 2003

Citations: 11

Affiliation: Nagoya University, Nara Institute of Science and Technology

#Cross-Language Information Retrieval #Comparable Corpora

Abstract

This paper presents an approach to bilingual lexicon extraction from comparable corpora and evaluates it on Cross-Language Information Retrieval. We explore bi-directional extraction of bilingual terminology, primarily from comparable corpora. We propose a combined statistics-based and linguistics-based model that selects the best translation candidates for phrasal translation. Evaluations on a large Japanese-English test collection showed that the proposed combination of bi-directional comparable corpora, bilingual dictionaries, and transliteration, augmented with linguistics-based pruning, is highly effective in Cross-Language Information Retrieval.
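
To make the idea of bi-directional extraction concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not state which similarity measure is used or how the two directions are combined, so the sketch assumes co-occurrence context vectors compared by cosine similarity and a simple product of the forward and backward scores. All function names, the seed dictionaries, and the toy romanized corpus are hypothetical, and transliteration and linguistics-based pruning are left out.

```python
import math
from collections import Counter


def context_vectors(sentences, window=2):
    """Map each word to counts of the words seen within `window` tokens of it."""
    vectors = {}
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            ctx = vectors.setdefault(word, Counter())
            for j in range(lo, hi):
                if j != i:
                    ctx[tokens[j]] += 1
    return vectors


def translate_vector(vec, seed):
    """Project a context vector into the other language via a seed dictionary."""
    projected = Counter()
    for word, count in vec.items():
        for translation in seed.get(word, ()):
            projected[translation] += count
    return projected


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def directional_scores(src_vecs, tgt_vecs, seed):
    """Score every (source, target) pair in one translation direction."""
    scores = {}
    for s, s_vec in src_vecs.items():
        projected = translate_vector(s_vec, seed)
        for t, t_vec in tgt_vecs.items():
            scores[(s, t)] = cosine(projected, t_vec)
    return scores


def bidirectional_scores(src_vecs, tgt_vecs, seed_st, seed_ts):
    """Combine both directions; the product is an assumption of this sketch."""
    forward = directional_scores(src_vecs, tgt_vecs, seed_st)
    backward = directional_scores(tgt_vecs, src_vecs, seed_ts)
    return {(s, t): forward[(s, t)] * backward.get((t, s), 0.0)
            for (s, t) in forward}


# Toy comparable corpora (hypothetical data, romanized Japanese and English).
ja = [["ginkou", "kane", "azukeru"], ["ginkou", "kane", "kariru"]]
en = [["bank", "money", "deposit"], ["bank", "money", "borrow"]]
seed_ja_en = {"kane": ["money"], "azukeru": ["deposit"], "kariru": ["borrow"]}
seed_en_ja = {"money": ["kane"], "deposit": ["azukeru"], "borrow": ["kariru"]}

scores = bidirectional_scores(context_vectors(ja), context_vectors(en),
                              seed_ja_en, seed_en_ja)
candidates = sorted(((score, t) for (s, t), score in scores.items()
                     if s == "ginkou"), reverse=True)
print(candidates[0][1])
```

Multiplying the two directional scores rewards only those candidate pairs that are supported in both directions, which is one plausible reading of "bi-directional"; a rank-based or averaged combination would fit the abstract equally well.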

Full Text