Abstract

English is an international language used for communication worldwide, but many people still cannot read, write, understand, or communicate in English. Conversely, the World Wide Web holds vast resources of information in other languages, which native English speakers find challenging to understand. To overcome such barriers, Cross-Language Information Retrieval (CLIR) systems, which retrieve documents across different languages, have been proposed. This work evaluates the performance of different Information Retrieval (IR) models in a CLIR system using a Quran dataset. It also investigates query length and query expansion models for effective retrieval. The results show that query length affects the effectiveness of the retrieval methods. Based on comprehensive experiments, an appropriate query length for an Arabic CLIR system is suggested, along with the best query expansion and retrieval models.

Highlights

  • Individuals need relevant information expressed in their natural language, commonly in the form of a query

  • The representation issue is more pronounced in Cross-Language Information Retrieval (CLIR) and Multi-Lingual Information Retrieval (MLIR), in which the documents and the queries are expressed in different languages

  • The translation model can be applied in different ways: a) mapping the document into the query representation space, known as the Document Translation Approach (DTA) [4], or b) mapping the query into the document term space, known as the Query Translation Approach (QTA) [5]

Summary

Introduction

Individuals need relevant information expressed in their natural language, commonly in the form of a query. There is a growing demand for exploring information written in languages other than that of the query, for example retrieving documents written in Arabic with a query written in English. This creates the problem of Cross-Language Information Retrieval (CLIR) [1], [2], [3]. The representation issue is more pronounced in CLIR and Multi-Lingual Information Retrieval (MLIR), in which the documents and the queries are expressed in different languages. When the content is written in various languages, how can one build a comparable internal representation of the queries that users pose? The central issue in CLIR is to implement a matching method that aligns terms from different languages that carry the same meaning. The translation model can be applied in different ways: a) mapping the document into the query representation space, known as the Document Translation Approach (DTA) [4], or b) mapping the query into the document term space, known as the Query Translation Approach (QTA) [5].
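To make the query translation idea concrete, the following is a minimal sketch of dictionary-based query translation followed by simple term-frequency matching. It is an illustration under stated assumptions, not the method evaluated in this work: the dictionary `EN_TO_AR`, the token lists in `DOCS`, and the function names `translate_query`, `rank`, and `qta_search` are all hypothetical, and the "documents" are made-up bags of tokens rather than text from the Quran dataset.

```python
from collections import Counter

# Hypothetical toy English-to-Arabic dictionary (illustration only).
EN_TO_AR = {
    "book": ["كتاب"],
    "light": ["نور"],
    "mercy": ["رحمة"],
    "knowledge": ["علم"],
}

# Made-up Arabic "documents" as bags of tokens; not real corpus text.
DOCS = {
    "d1": ["كتاب", "نور"],
    "d2": ["رحمة", "علم", "رحمة"],
    "d3": ["علم", "كتاب", "كتاب"],
}


def translate_query(query, dictionary):
    """Query translation: map each query term into the document language."""
    translated = []
    for term in query.lower().split():
        # Terms missing from the dictionary are silently dropped.
        translated.extend(dictionary.get(term, []))
    return translated


def rank(query_terms, docs):
    """Score each document by the summed term frequency of the query terms."""
    scores = {}
    for doc_id, tokens in docs.items():
        tf = Counter(tokens)
        score = sum(tf[t] for t in query_terms)
        if score > 0:
            scores[doc_id] = score
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def qta_search(query, dictionary=EN_TO_AR, docs=DOCS):
    """Translate an English query, then match it against Arabic documents."""
    return rank(translate_query(query, dictionary), docs)


if __name__ == "__main__":
    print(qta_search("book knowledge"))
```

A document translation approach would instead apply the reverse mapping to every document once at indexing time and leave the query untouched. Translating the query is cheaper per request, but short queries give the translator little context, which is one reason query length and query expansion matter in this setting.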
