Abstract

Cross-Lingual Information Retrieval (CLIR) enables a user to query documents written in a language different from that of the query. CLIR relies on Machine Translation (MT), which is still developing for Indian languages because sufficient resources are unavailable. In this paper, a Statistical Machine Translation (SMT) system is trained separately on two parallel corpora, and a large English corpus is used for language modeling in SMT. The resulting systems are evaluated using the BLEU score; these experimental setups are then used to translate Hindi queries for an experimental analysis of Hindi-English CLIR. Because SMT does not handle morphological variants, whereas the proposed Translation Induction Algorithm (TIA) does, TIA outperforms the SMT systems in the context of CLIR.
