Abstract

Because the Web contains documents from many domains and genres, methods for cross-language information retrieval (CLIR) of Web documents should not depend on a particular domain. In this paper, we propose a query disambiguation method for CLIR that uses Web directories available in multiple language versions (such as Yahoo). In the proposed method, feature terms are first extracted from the Web documents in each category in the source and target languages. Then, for each category, one or more corresponding categories in the other language are determined beforehand by comparing similarities between categories across languages. Using these category pairs, we can resolve ambiguities in simple dictionary translations. To assess the effectiveness of our method, we ran retrieval experiments using the English and Japanese versions of Yahoo. The results showed that the proposed method is more effective for CLIR than the previous method.
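
The three steps named above (feature term extraction, cross-language category pairing, and translation disambiguation) can be illustrated with a minimal Python sketch. It is not the implementation evaluated in the paper. The abstract does not specify the term weighting, the similarity measure, or how a query is assigned to a source category, so the sketch assumes raw term frequency, cosine similarity, and a source category supplied by the caller. It also keeps only the single best target category, whereas the method allows one or more. All function names, the toy dictionary, and the romanized Japanese tokens are hypothetical.

```python
import math
from collections import Counter


def feature_terms(docs, top_k=50):
    """Assumed weighting: raw term frequency over a category's documents."""
    counts = Counter(term for doc in docs for term in doc.split())
    return dict(counts.most_common(top_k))


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def translate_features(features, dictionary):
    """Project source features into the target language, keeping every candidate."""
    projected = Counter()
    for term, weight in features.items():
        for candidate in dictionary.get(term, []):
            projected[candidate] += weight
    return dict(projected)


def pair_categories(src_cats, tgt_cats, dictionary):
    """For each source category, pick the most similar target category."""
    pairs = {}
    for name, feats in src_cats.items():
        projected = translate_features(feats, dictionary)
        pairs[name] = max(tgt_cats, key=lambda t: cosine(projected, tgt_cats[t]))
    return pairs


def disambiguate(query_terms, src_category, pairs, tgt_cats, dictionary):
    """Keep the candidate that weighs most in the paired target category."""
    tgt_feats = tgt_cats[pairs[src_category]]
    chosen = []
    for term in query_terms:
        candidates = dictionary.get(term, [])
        if candidates:
            chosen.append(max(candidates, key=lambda c: tgt_feats.get(c, 0.0)))
    return chosen


# Toy data (hypothetical): "bank" is ambiguous between ginko (financial
# institution) and dote (river bank); "interest" between kinri and kyomi.
dictionary = {
    "bank": ["ginko", "dote"],
    "interest": ["kinri", "kyomi"],
    "loan": ["yushi"],
    "rate": ["ritsu"],
}
src_cats = {"Finance": feature_terms(["bank loan interest", "bank interest rate"])}
tgt_cats = {
    "Kinyu": feature_terms(["ginko kinri yushi", "ginko kinri"]),
    "Shizen": feature_terms(["kawa dote sakana", "kawa dote"]),
}

pairs = pair_categories(src_cats, tgt_cats, dictionary)
translated = disambiguate(["bank", "interest"], "Finance", pairs, tgt_cats, dictionary)
print(pairs, translated)
```

In this toy setup the category pairing is computed once, ahead of query time, which mirrors the "determined beforehand" step. The query-time work is then a dictionary lookup plus a comparison of candidate weights within the paired category.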
