Abstract

Cross-lingual information retrieval (CLIR) refers to information retrieval tasks in which the query and/or the documents may appear in different languages. Dictionary-based query translation is a common method in CLIR systems. Such methods face the problem of translation ambiguity, in which a single word in one language has more than one translation in the other language. In this paper we propose a hybrid approach for retrieving English documents relevant to Persian queries. The approach combines phrase reorganization, pattern-based phrase translation, and query expansion before and after translation to improve dictionary-based query translation. We also propose an improved probabilistic algorithm for choosing the best translation of words and phrases. Finally, the documents are ranked according to a statistical language model with some translation steps. Our experimental results show that each of these methods can bring significant improvement over simple dictionary approaches.
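To make the translation ambiguity problem concrete, the following is a minimal sketch of dictionary-based query translation in which ambiguity is resolved by picking the combination of candidates that co-occurs most often in the target language. It is not the probabilistic algorithm proposed in the paper, which the abstract does not specify. The dictionary entries (transliterated Persian), the co-occurrence counts, and the function names are all hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical toy bilingual dictionary: transliterated Persian term -> English candidates.
TOY_DICTIONARY = {
    "shir": ["lion", "milk", "faucet"],
    "jangal": ["forest", "jungle"],
}

# Hypothetical co-occurrence counts of English word pairs in a target-language corpus.
TOY_COOCCURRENCE = {
    frozenset(["lion", "jungle"]): 40,
    frozenset(["lion", "forest"]): 15,
    frozenset(["milk", "forest"]): 1,
}


def cooccurrence(a, b):
    """Return the (assumed) corpus co-occurrence count for an unordered word pair."""
    return TOY_COOCCURRENCE.get(frozenset([a, b]), 0)


def translate_query(source_terms, dictionary=TOY_DICTIONARY):
    """Pick one translation per term so that the summed pairwise co-occurrence is maximal.

    Terms missing from the dictionary are passed through unchanged.
    The search is exhaustive, so this is only suitable for short queries.
    """
    candidate_lists = [dictionary.get(term, [term]) for term in source_terms]
    best, best_score = None, -1
    for combo in product(*candidate_lists):
        score = sum(
            cooccurrence(a, b)
            for i, a in enumerate(combo)
            for b in combo[i + 1:]
        )
        if score > best_score:
            best, best_score = list(combo), score
    return best


if __name__ == "__main__":
    print(translate_query(["shir", "jangal"]))
```

With the toy data above, the ambiguous term "shir" is resolved by its context term "jangal". A real system would estimate the statistics from a corpus and would need a search strategy that scales beyond exhaustive enumeration.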

