Abstract

Learning to rank (LTR) refers to machine learning techniques for training a model on a ranking task. LTR has proven useful in many information retrieval (IR) applications. Cross-language information retrieval (CLIR) is one of the major IR tasks that could benefit from LTR to improve ranking accuracy. CLIR addresses the problem of expressing a query in one language and retrieving relevant documents in another. One of the most important issues in CLIR is how to apply monolingual IR methods in cross-lingual environments. In this paper, we propose a new method that exploits LTR for CLIR, in which documents are represented as feature vectors. The method provides a mapping, based on IR heuristics, that makes monolingual IR features usable in parallel-corpus-based CLIR. The mapped features serve as training data for LTR. We show that LTR trained on the mapped features can improve CLIR performance. A comprehensive evaluation on English-Persian CLIR suggests that our method yields significant improvements over both parallel-corpus-based and dictionary-based methods.
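To make the idea of training LTR on mapped feature vectors concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes that a monolingual term frequency can be carried across languages by weighting document terms with translation probabilities estimated from a parallel corpus. The three features and the pairwise perceptron learner are illustrative choices, and the names `mapped_tf`, `features`, and `train_pairwise`, as well as the toy translation table, are hypothetical.

```python
from typing import Dict, List, Tuple

TransTable = Dict[str, Dict[str, float]]  # source term -> {target term: probability}


def mapped_tf(query_term: str, doc_tokens: List[str], trans: TransTable) -> float:
    """Expected frequency of a source-language term in a target-language document."""
    candidates = trans.get(query_term, {})
    return sum(candidates.get(tok, 0.0) for tok in doc_tokens)


def features(query: List[str], doc_tokens: List[str], trans: TransTable) -> List[float]:
    """Three illustrative features: mapped TF, length-normalised TF, query coverage."""
    tfs = [mapped_tf(t, doc_tokens, trans) for t in query]
    doc_len = max(len(doc_tokens), 1)
    coverage = sum(1.0 for tf in tfs if tf > 0) / max(len(query), 1)
    return [sum(tfs), sum(tfs) / doc_len, coverage]


def score(w: List[float], x: List[float]) -> float:
    """Linear ranking score."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_pairwise(pairs: List[Tuple[List[float], List[float]]],
                   n_features: int, epochs: int = 20, lr: float = 0.1) -> List[float]:
    """Pairwise perceptron: update whenever a relevant document does not outscore
    an irrelevant one for the same query."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for better, worse in pairs:
            diff = [b - c for b, c in zip(better, worse)]
            if score(w, diff) <= 0:
                w = [wi + lr * d for wi, d in zip(w, diff)]
    return w


# Toy data (invented for illustration): English query, transliterated Persian documents.
trans_table: TransTable = {
    "book": {"ketab": 0.9, "daftar": 0.1},
    "library": {"ketabkhaneh": 1.0},
}
query = ["book", "library"]
x_rel = features(query, ["ketab", "ketabkhaneh", "ketab"], trans_table)
x_irr = features(query, ["daftar", "madreseh", "shahr"], trans_table)

weights = train_pairwise([(x_rel, x_irr)], n_features=3)
ranked = sorted([("relevant", x_rel), ("irrelevant", x_irr)],
                key=lambda item: score(weights, item[1]), reverse=True)
```

In practice the linear learner would be replaced by an established LTR algorithm, and the feature set would cover the usual monolingual IR signals once mapped into the cross-lingual setting.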
