Studying the effect and treatment of misspelled queries in Cross-Language Information Retrieval

Jesús Vilares, Miguel A Alonso, Yerai Doval, Manuel Vilares

doi:10.1016/j.ipm.2015.12.010

Abstract

In contrast with monolingual retrieval, little attention has been paid to the effect of misspelled queries on the performance of Cross-Language Information Retrieval (CLIR) systems. The present work makes a first attempt to fill this gap by extending our previous work on monolingual retrieval to study how the progressive addition of misspellings to input queries affects the output of CLIR systems. Two approaches to this problem are analyzed. The first is automatic spelling correction, for which we consider two algorithms: one that corrects isolated words and one that corrects a misspelled word based on its linguistic context. The second is the use of character n-grams both as index terms and as translation units, seeking to take advantage of their inherent robustness and language independence. All these approaches have been tested on a Spanish-to-English CLIR system, that is, Spanish queries run against English documents. Real, user-generated spelling errors have been used under a methodology that allows us to study the effectiveness of each approach and its behavior at different error rates. The results show that classic word-based approaches are highly sensitive to misspelled queries, although spelling correction techniques can mitigate these negative effects. The use of character n-grams, in contrast, provides great robustness against misspellings.
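The robustness of character n-grams mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the n-gram length of 4, the Dice coefficient used as the overlap measure, and the function names `char_ngrams` and `dice_overlap` are all assumptions made for illustration only.

```python
def char_ngrams(word, n=4):
    """Return the overlapping character n-grams of a word (n=4 is an assumed size)."""
    if len(word) <= n:
        return [word]
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def dice_overlap(a, b, n=4):
    """Dice coefficient between the n-gram sets of two words (illustrative measure)."""
    grams_a = set(char_ngrams(a, n))
    grams_b = set(char_ngrams(b, n))
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


correct = "recuperacion"      # intended Spanish query term
misspelled = "recuperasion"   # hypothetical single-character substitution error

# Word-based matching: the misspelled term is simply a different index term.
word_match = correct == misspelled

# N-gram matching: the n-grams not touching the error are still shared.
shared = set(char_ngrams(correct)) & set(char_ngrams(misspelled))
score = dice_overlap(correct, misspelled)

print(word_match, sorted(shared), round(score, 2))
```

In this toy case the whole-word comparison fails outright, while five of the nine 4-grams of each term still coincide, so an n-gram index would retain partial evidence for the intended term.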

Journal: Information Processing & Management
Publication Date: Jan 12, 2016
Citations: 40
License type: other-oa

Similar Papers

Query translation by text categorization
Patrick Ruch
01 Jan 2004

Prediction of performance of cross-language information retrieval using automatic evaluation of translation
Kazuaki Kishida
Library and Information Science Research | VOL. 30
17 Apr 2008

Speech and text query based Tamil - English Cross Language Information Retrieval system
P Iswarya ... V Radha
01 Jan 2014

“They Are Out There, If You Know Where to Look”: Mining Transliterations of OOV Query Terms for Cross-Language Information Retrieval
Raghavendra Udupa ... Anton Bakalov
01 Jan 2009
