Abstract

The Internet has become a huge information resource containing content in many languages. Because users are not familiar with every language, this diversity is a major barrier to global communication. Cross-Language Information Retrieval (CLIR) addresses this barrier by letting a user search for the required information in his or her regional language. This paper proposes a CLIR system based on a Parallel Corpus (PC). A set of parallel sentences is extracted from the PC based on the query words. A term frequency matrix and the cosine similarity measure are then used to identify the target-language translation of each query word. The proposed Term Frequency Method (TFM) is compared with the Probabilistic Lexicon Method (PLM), and the result analysis shows that TFM performs better than PLM.
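The pipeline outlined above (sentence extraction, term frequency matrix, cosine similarity) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the toy English-Spanish corpus, the whitespace tokenization, the function names `cosine` and `translate_query`, and the rule of picking the single highest-scoring target word are all assumptions made for illustration only.

```python
import math

# Hypothetical toy parallel corpus of (source sentence, target sentence) pairs.
TOY_CORPUS = [
    ("the house is big", "la casa es grande"),
    ("a house is old", "una casa es vieja"),
    ("the water is cold", "el agua es fria"),
    ("this water is clean", "esta agua es limpia"),
    ("the garden is big", "el jardin es grande"),
]


def cosine(u, v):
    """Cosine similarity of two equal-length count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def translate_query(query_words, corpus):
    """Map each query word to the target word with the most similar usage pattern."""
    # Step 1: keep only sentence pairs whose source side contains a query word.
    selected = [
        (src.split(), tgt.split())
        for src, tgt in corpus
        if set(src.split()) & set(query_words)
    ]
    if not selected:
        return {}

    # Step 2: term frequency matrix, one row per target word and one column
    # per selected sentence pair.
    target_vocab = sorted({w for _, tgt in selected for w in tgt})
    tf_target = {w: [tgt.count(w) for _, tgt in selected] for w in target_vocab}

    # Step 3: compare each query word's row with every target row.
    translations = {}
    for q in query_words:
        q_vec = [src.count(q) for src, _ in selected]
        if not any(q_vec):
            continue  # query word never occurs in the corpus
        translations[q] = max(target_vocab, key=lambda w: cosine(q_vec, tf_target[w]))
    return translations


print(translate_query(["house", "water"], TOY_CORPUS))
```

On this toy data, "casa" and "agua" are the only target words that occur in exactly the same sentence pairs as "house" and "water", so they receive the highest cosine scores. A realistic corpus would need proper tokenization and some handling of ties and very frequent function words.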
