Abstract

In Cross-Language Information Retrieval (CLIR), translation quality has a direct impact on the accuracy of the subsequent retrieval results. In the dictionary-based approach, many words have more than one meaning, which can decrease retrieval performance if the query translation returns an incorrect translation. These issues need to be overcome with an efficient technique. In this study, we propose a CLIR method based on a domain ontology that uses Quran concepts to disambiguate the query translation and to improve dictionary-based query translation. For experimentation, we use a Quran ontology written in English and Malay as a bilingual parallel corpus, and Quran concepts as a resource for cross-language query translation alongside dictionary-based translation. For evaluation, we measure the performance of three IR systems using Mean Average Precision (MAP) and average precision at 11 points of recall: IR1 is natural-language query IR, IR2 is natural-language query CLIR based on a dictionary (the baseline), and IR3 is retrieval with the proposed method. The experimental results show that the proposed method brings a significant improvement in retrieval accuracy for English document collections but falls short for Malay document collections. The proposed CLIR method can obtain a query expansion effect and improve retrieval performance for certain languages.
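The two evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes the standard textbook definitions of average precision and 11-point interpolated precision; the function names and the toy ranking are hypothetical and are not taken from the study, whose exact evaluation setup is not given here.

```python
from typing import List, Sequence, Set


def average_precision(ranking: Sequence[str], relevant: Set[str]) -> float:
    """Average of the precision values at each rank where a relevant document appears."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    # Relevant documents that are never retrieved contribute zero.
    return total / len(relevant)


def mean_average_precision(rankings: Sequence[Sequence[str]],
                           relevants: Sequence[Set[str]]) -> float:
    """MAP: the mean of average precision over all queries."""
    if not rankings:
        return 0.0
    scores = [average_precision(r, rel) for r, rel in zip(rankings, relevants)]
    return sum(scores) / len(scores)


def eleven_point_precision(ranking: Sequence[str], relevant: Set[str]) -> List[float]:
    """Interpolated precision at recall levels 0.0, 0.1, ..., 1.0."""
    points = []  # (recall, precision) at each rank holding a relevant document
    hits = 0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / rank))
    result = []
    for i in range(11):
        level = i / 10
        # Interpolation: best precision at any recall >= this level.
        reachable = [p for r, p in points if r >= level - 1e-12]
        result.append(max(reachable, default=0.0))
    return result


# Hypothetical toy query: four retrieved documents, two of them relevant.
toy_ranking = ["d1", "d2", "d3", "d4"]
toy_relevant = {"d1", "d3"}
print(average_precision(toy_ranking, toy_relevant))
print(eleven_point_precision(toy_ranking, toy_relevant))
```

Averaging the 11-point values across all queries at each recall level gives the kind of precision-recall curve typically reported alongside MAP.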

Highlights

  • Nowadays, the usage of computers and the Internet has grown

  • The Mean Average Precision (MAP) result on English documents is higher for IR3 than for IR2 by 2 and 0.6%, while the MAP result on Malay documents is lower for IR3 than for IR2 by 2.5 and 2.8%, respectively

  • The translation approach in IR2 showed that CLIR query translation using only the first translation listed in the dictionary obtains better results than using all translation candidates listed in the dictionary, for both English and Malay document collections
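The two dictionary-based strategies compared in the last highlight can be sketched as follows. This is only an illustration under stated assumptions: the toy Malay-to-English dictionary, the order of its entries, and the function names are hypothetical and do not reproduce the dictionary or the code used in the study.

```python
from typing import Dict, List, Sequence

# Hypothetical toy bilingual dictionary (Malay -> English).
# The order of candidates is assumed, as in a printed dictionary entry.
TOY_DICTIONARY: Dict[str, List[str]] = {
    "bulan": ["month", "moon"],  # ambiguous term with two senses
    "puasa": ["fasting"],
}


def translate_first(terms: Sequence[str], dictionary: Dict[str, List[str]]) -> List[str]:
    """Keep only the first listed translation of each query term."""
    translated = []
    for term in terms:
        candidates = dictionary.get(term)
        # Terms missing from the dictionary are passed through unchanged.
        translated.append(candidates[0] if candidates else term)
    return translated


def translate_all(terms: Sequence[str], dictionary: Dict[str, List[str]]) -> List[str]:
    """Keep every listed translation candidate of each query term."""
    translated = []
    for term in terms:
        translated.extend(dictionary.get(term) or [term])
    return translated


query = ["bulan", "puasa"]
print(translate_first(query, TOY_DICTIONARY))  # one target term per source term
print(translate_all(query, TOY_DICTIONARY))    # also includes the unintended sense "moon"
```

Keeping all candidates adds off-topic terms such as "moon" to the translated query, which is one plausible reason the first-translation variant retrieved better in IR2.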


Summary

Introduction

More than one billion people use the Internet and benefit from the information available on it. This information is written in their native language and in other, non-native languages, and it has expanded rapidly with the growth of the Internet. Information Retrieval (IR) generally refers to the process by which a user searches for needed information in a large number of documents. Traditional IR is implemented mainly for monolingual documents and only supports retrieval of documents written in the same language as the query. Cross-Language Information Retrieval (CLIR) is intended to match a user query written in one language with documents written in other languages. In CLIR, systems automatically search documents written in other languages

