Improving Retrieval performance of English-Hindi based Cross-Language Information Retrieval

Saurabh Varshney, Jyoti Bajpai

doi:10.1109/mite.2013.6756354

Abstract

A major hurdle in Cross-Language Information Retrieval (CLIR) is its poor performance relative to monolingual retrieval in terms of average precision. The main reasons for this gap are query term mismatch, multiple representations of query terms, and untranslated query terms. In this paper, we address these problems, which are discussed in detail; these limitations must be overcome in order to improve the performance of a CLIR system. By analyzing existing approaches, we propose an architecture for an English-Hindi CLIR system. Pre- and post-query expansion is used to improve the performance of the English-Hindi CLIR system, drawing on English and Hindi WordNet, local expansion using the initial query, definition-based pre-query expansion, and keyword ranking. The pre- and post-query expansion helps to improve the performance of the English-Hindi CLIR system and, based on past experience, the proposed approach retrieves more relevant information. All experiments are performed on the FIRE 2010 (Forum for Information Retrieval Evaluation) datasets. The experimental results show that the proposed approach achieves performance equal to or better than monolingual retrieval, helps to overcome the problems listed above, and outperforms the existing English-Hindi CLIR system in terms of average precision.
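To make the idea of expanding a query on both sides of translation concrete, the following is a minimal sketch and not the system described in this paper. It assumes that "pre" and "post" refer to expansion before and after query translation. The small dictionaries `EN_SYNONYMS`, `EN_HI_DICT`, and `HI_SYNONYMS` are hypothetical stand-ins for English WordNet, a bilingual lexicon, and Hindi WordNet, and the function names are illustrative. Local expansion, definition-based expansion, and keyword ranking are omitted.

```python
# Toy lexical resources. Every entry is an illustrative assumption,
# not data taken from WordNet, Hindi WordNet, or the FIRE collection.
EN_SYNONYMS = {"river": ["stream"], "pollution": ["contamination"]}
EN_HI_DICT = {"river": ["नदी"], "stream": ["धारा"], "pollution": ["प्रदूषण"]}
HI_SYNONYMS = {"नदी": ["सरिता"]}


def expand(terms, synonyms):
    """Add the synonyms of each term, keeping order and dropping duplicates."""
    expanded = []
    for term in terms:
        for candidate in [term] + synonyms.get(term, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


def translate(terms, bilingual):
    """Look up each term; return (translated terms, terms with no entry)."""
    translated, untranslated = [], []
    for term in terms:
        targets = bilingual.get(term)
        if not targets:
            untranslated.append(term)
            continue
        for target in targets:
            if target not in translated:
                translated.append(target)
    return translated, untranslated


def clir_query(query):
    """Expand in English, translate to Hindi, then expand again in Hindi."""
    source_terms = query.lower().split()
    pre_expanded = expand(source_terms, EN_SYNONYMS)
    translated, untranslated = translate(pre_expanded, EN_HI_DICT)
    post_expanded = expand(translated, HI_SYNONYMS)
    return post_expanded, untranslated


if __name__ == "__main__":
    hindi_terms, missing = clir_query("River pollution")
    print(hindi_terms)
    print(missing)
```

In this sketch, source-side expansion gives the translation step more chances to find a dictionary entry, while target-side expansion adds alternative Hindi forms of the same concept. Terms without a dictionary entry are returned separately rather than dropped, so a caller could handle them in a later step.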

Full Text