Abstract
English is an international language used for communication worldwide, yet many people cannot read, write, understand, or communicate in English. Conversely, the World Wide Web holds vast amounts of information in other languages, which native English speakers find challenging to understand. To overcome such barriers, Cross-Language Information Retrieval (CLIR) systems have been proposed; CLIR refers to document retrieval tasks across different languages. This work evaluates the performance of different Information Retrieval (IR) models in a CLIR system using a Quran dataset. It also investigates query length and query expansion models for effective retrieval. The results show that query length affects the effectiveness of the retrieval methods. Based on comprehensive experiments, an appropriate query length for an Arabic CLIR system is suggested, along with the best query expansion and retrieval model.
Highlights
Individuals express their need for relevant information in their natural language, commonly in the form of a query
The representation issue is more obvious in Cross-Language Information Retrieval (CLIR) and Multi-Lingual Information Retrieval (MLIR), in which the documents and the queries are expressed in different languages
The translation model can be used in different ways: a) mapping the documents into the query representation space, known as the Document Translation Approach (DTA) [4]; b) mapping the query into the document term space, known as the Query Translation Approach (QTA) [5]
Summary
Individuals express their need for relevant information in their natural language, commonly in the form of a query. There is a growing desire to explore information in languages different from that of the query, for example retrieving documents written in Arabic with a query written in English. This creates the problem of Cross-Language Information Retrieval [1], [2], [3]. The representation issue is more obvious in CLIR and Multi-Lingual Information Retrieval (MLIR), in which the documents and the queries are expressed in different languages. When content is written in various languages, how can one build a matching internal representation of the queries that ask for that information? The key issue in CLIR is to implement a matching method that relates terms in different languages that carry a similar sense. The translation model can be used in different ways: a) mapping the documents into the query representation space, known as the Document Translation Approach (DTA) [4]; b) mapping the query into the document term space, known as the Query Translation Approach (QTA) [5].
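The difference between the two translation approaches can be illustrated with a minimal Python sketch. It is not the system evaluated in this work: the three-word bilingual dictionary, the toy Arabic collection, the term-overlap scoring, and all function names (`translate`, `rank`, `retrieve_qta`, `retrieve_dta`) are assumptions made purely for illustration. A real CLIR system would rely on a full lexicon or machine translation and on a proper retrieval model.

```python
# Hypothetical toy dictionary; stands in for a real translation resource.
EN_TO_AR = {"mercy": "رحمة", "book": "كتاب", "light": "نور"}
AR_TO_EN = {ar: en for en, ar in EN_TO_AR.items()}

# Hypothetical tiny Arabic collection, already tokenized.
DOCS_AR = {
    "d1": ["كتاب", "نور"],
    "d2": ["رحمة"],
    "d3": ["نور", "رحمة", "رحمة"],
}


def translate(tokens, dictionary):
    """Map each token through the dictionary, dropping unknown tokens."""
    return [dictionary[t] for t in tokens if t in dictionary]


def overlap_score(query_tokens, doc_tokens):
    """Count how many document tokens match any query term."""
    query_terms = set(query_tokens)
    return sum(1 for t in doc_tokens if t in query_terms)


def rank(query_tokens, docs):
    """Return ids of matching documents, best score first (ties by id)."""
    scored = [(doc_id, overlap_score(query_tokens, toks))
              for doc_id, toks in docs.items()]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return [doc_id for doc_id, score in scored if score > 0]


def retrieve_qta(query_en):
    """Query Translation Approach: translate the query, search Arabic docs."""
    query_ar = translate(query_en.lower().split(), EN_TO_AR)
    return rank(query_ar, DOCS_AR)


def retrieve_dta(query_en):
    """Document Translation Approach: translate the docs, search in English."""
    docs_en = {d: translate(toks, AR_TO_EN) for d, toks in DOCS_AR.items()}
    return rank(query_en.lower().split(), docs_en)


if __name__ == "__main__":
    print(retrieve_qta("mercy"))
    print(retrieve_dta("book light"))
```

In this sketch both approaches return the same ranking because the toy dictionary is one-to-one. With real translation resources they can diverge, since QTA translates only a short query while DTA must translate the whole collection.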