Abstract
Because the Web contains documents from many domains and genres, methods for cross-language information retrieval (CLIR) of Web documents should not depend on any particular domain. In this paper, we propose a query disambiguation method for CLIR that uses Web directories available in multiple language versions (such as Yahoo). In the proposed method, feature terms are first extracted from the Web documents in each category of the source-language and target-language directories. Then, for each category, one or more corresponding categories in the other language are determined beforehand by comparing the similarities between categories across languages. Using these category pairs, we can resolve the ambiguities that arise in simple dictionary translation. To assess the effectiveness of the method, we conducted retrieval experiments using the English and Japanese versions of Yahoo. The results showed that the proposed method is more effective for CLIR than the previous method.
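The three steps named in the abstract (feature-term extraction per category, cross-language category matching, and translation disambiguation) can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper. It assumes raw term frequencies as feature weights, cosine similarity as the category similarity measure, and a toy bilingual dictionary with romanized Japanese entries. All function names, category names, and data below are hypothetical.

```python
from collections import Counter
from math import sqrt


def feature_terms(docs):
    """Build a term-frequency vector for one category (assumed weighting)."""
    return Counter(tok for doc in docs for tok in doc)


def translate_vector(vec, dictionary):
    """Project a source-language vector into the target language,
    splitting each weight evenly over all dictionary translations."""
    out = Counter()
    for term, weight in vec.items():
        candidates = dictionary.get(term, [])
        for cand in candidates:
            out[cand] += weight / len(candidates)
    return out


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = sqrt(sum(w * w for w in a.values())) * sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def match_category(src_vec, tgt_categories, dictionary):
    """Return the name of the target category most similar to the source category."""
    translated = translate_vector(src_vec, dictionary)
    return max(tgt_categories, key=lambda name: cosine(translated, tgt_categories[name]))


def disambiguate(query_terms, tgt_vec, dictionary):
    """For each query term, keep the translation candidate that carries
    the highest weight in the matched target category."""
    chosen = []
    for term in query_terms:
        candidates = dictionary.get(term, [])
        if candidates:
            chosen.append(max(candidates, key=lambda c: tgt_vec.get(c, 0)))
    return chosen


# Toy data (hypothetical): an English "Finance" category and two Japanese categories.
dictionary = {
    "bank": ["ginko", "dote"],       # financial institution / river embankment
    "loan": ["yushi"],
    "interest": ["kinri", "kyomi"],  # interest rate / curiosity
}
src_finance = feature_terms([["bank", "loan"], ["bank", "interest"]])
tgt_categories = {
    "Kinyu": feature_terms([["ginko", "yushi"], ["ginko", "kinri"]]),
    "Chiri": feature_terms([["kawa", "dote"], ["kawa", "yama"]]),
}

best = match_category(src_finance, tgt_categories, dictionary)
translated_query = disambiguate(["bank", "interest"], tgt_categories[best], dictionary)
```

In this toy setting the ambiguous term "bank" is resolved by the category paired with the source category rather than by the dictionary alone. Choosing a single best category is a simplification, since the abstract allows one or more corresponding categories per source category.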