Cross-language Retrieval Research Articles

The Text REtrieval Conference (TREC), a yearly workshop hosted by the US government's National Institute of Standards and Technology, provides the infrastructure necessary for large-scale evaluation of text retrieval methodologies. With the goal of accelerating research in this area, TREC created the first large test collections of full-text documents and standardized retrieval evaluation. The impact has been significant; since TREC's beginning in 1992, retrieval effectiveness has approximately doubled. TREC has built a variety of large test collections, including collections for such specialized retrieval tasks as cross-language retrieval and retrieval of speech. Moreover, TREC has accelerated the transfer of research ideas into commercial systems, as demonstrated by the number of retrieval techniques developed in TREC that are now used in Web search engines. This book provides a comprehensive review of TREC research, summarizing the variety of TREC results, documenting the best practices in experimental information retrieval, and suggesting areas for further research. The first part of the book describes TREC's history, test collections, and retrieval methodology. Next, the book provides track reports describing the evaluations of specific tasks, including routing and filtering, interactive retrieval, and retrieving noisy text. The final part of the book offers perspectives on TREC from such participants as Microsoft Research, University of Massachusetts, Cornell University, University of Waterloo, City University of New York, and IBM. The book will be of interest to researchers in information retrieval and related technologies, including natural language processing.

At the NTCIR-4 workshop, Justsystem Corporation (JSC) and Clairvoyance Corporation (CC) collaborated in the cross-language retrieval task (CLIR). Our goal was to evaluate the performance and robustness of our recently developed commercial-grade CLIR systems for English and Asian languages. The main contribution of this article is the investigation of different strategies, their interactions in both monolingual and bilingual retrieval tasks, and their respective contributions to operational retrieval systems in the context of NTCIR-4. We report results of Japanese and English monolingual retrieval and results of Japanese-to-English bilingual retrieval. In monolingual retrieval analysis, we examine two special properties of the NTCIR experimental design (two levels of relevance and identical queries in multiple languages) and explore how they interact with strategies of our retrieval system, including pseudo-relevance feedback, multi-word term down-weighting, and term weight merging strategies. Our analysis shows that the choice of language (English or Japanese) does not have a significant impact on retrieval performance. Query expansion is slightly more effective with relaxed judgments than with rigid judgments. For better retrieval performance, weights of multi-word terms should be lowered. In the bilingual retrieval analysis, we aim to identify robust strategies that are effective when used alone and when used in combination with other strategies. We examine strategies specific to cross-lingual retrieval, such as translation disambiguation and translation structuring, as well as general strategies such as pseudo-relevance feedback and multi-word term down-weighting. For shorter title topics, pseudo-relevance feedback is a major performance enhancer, but translation structuring affects retrieval performance negatively when used alone or in combination with other strategies. All of the strategies tested improve retrieval performance for the longer description topics, with pseudo-relevance feedback and translation structuring as the major contributors.
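Two of the general strategies named in this abstract, pseudo-relevance feedback and multi-word term down-weighting, can be illustrated with a minimal Python sketch. It is not the JSC/CC system: the function name `expand_query`, its parameters, the default values, and the rule that a term containing a space counts as multi-word are all assumptions made for illustration. The sketch treats the top-ranked documents of a first retrieval pass as relevant, adds their most frequent new terms to the query, and then scales down the weight of every multi-word term.

```python
from collections import Counter


def expand_query(query_weights, ranked_docs, top_k=2, n_terms=2,
                 feedback_weight=0.5, multiword_factor=0.5):
    """Hypothetical sketch of pseudo-relevance feedback with
    multi-word term down-weighting.

    query_weights: dict mapping query term -> weight
    ranked_docs:   list of documents (each a list of terms), best first
    """
    # Assume the top_k documents of the initial ranking are relevant
    # and count how often each term occurs in them.
    counts = Counter()
    for doc in ranked_docs[:top_k]:
        counts.update(doc)

    expanded = dict(query_weights)
    if counts:
        max_count = max(counts.values())
        # Candidate expansion terms: not already in the query,
        # ordered by frequency (ties broken alphabetically).
        candidates = sorted(
            (term for term in counts if term not in query_weights),
            key=lambda term: (-counts[term], term),
        )
        for term in candidates[:n_terms]:
            expanded[term] = feedback_weight * counts[term] / max_count

    # Assumption: a term containing a space is a multi-word term,
    # and its weight is lowered by a constant factor.
    return {
        term: weight * multiword_factor if " " in term else weight
        for term, weight in expanded.items()
    }


query = {"cross-language retrieval": 1.0, "translation": 1.0}
docs = [
    ["translation", "dictionary", "query expansion", "dictionary"],
    ["dictionary", "query expansion", "corpus"],
    ["unrelated"],
]
expanded = expand_query(query, docs)
```

In this toy run, "dictionary" and "query expansion" are added from the two feedback documents, and both multi-word terms end up with reduced weights. A real system would weight candidate terms with collection statistics rather than raw counts.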

Articles published on Cross-language Retrieval

Biomedical information retrieval across languages

The CLEF 2005 Automatic Medical Image Annotation Task

Integrating textual and visual information for cross-language image retrieval: A trans-media dictionary approach

Automatic lexeme acquisition for a multilingual medical subword thesaurus

Combination Approaches in Korean Information Retrieval: Words vs. n-grams, and Query Translation vs. Document Translation

English-Arabic Cross-Language Information Retrieval Based on Parallel Documents

Towards effective strategies for monolingual and bilingual information retrieval

MorphoSaurus

Translation events in cross-language information retrieval

Anchor text mining for translation of Web queries

Mandarin–English Information (MEI): investigating translingual speech retrieval

Applying query structuring in cross-language retrieval

Semantic annotation for concept-based cross-language medical information retrieval

Evaluating Chinese Text Retrieval with Multilingual Queries

An Analysis of Web Users' Search Engine Use, Search Behavior, and Tendencies

Exploiting the LDC Chinese-English Bilingual Wordlist for Cross Language Information Retrieval

Cross-Language Text Retrieval by Query Translation Using Term Reweighting

SIGIR workshop on interactive retrieval at TREC and beyond

Using EuroWordNet in a concept-based approach to cross-language text retrieval
