Evaluation of Average Term Occurrences Weighting Technique for Arabic Textual Information Retrieval

Belal Mustafa Abuata, Lama Ali Al Omari

doi:10.18517/ijaseit.12.6.13215

Open Access

https://doi.org/10.18517/ijaseit.12.6.13215

Abstract

Document retrieval is an important task today, and the vector space retrieval model relies on a term weighting scheme as its basic method for matching queries with documents. Term Frequency-Inverse Document Frequency (TF-IDF) is a well-known and widely used term weighting scheme, and many studies have demonstrated its effectiveness in information retrieval. However, it has drawbacks, such as retrieving irrelevant documents, which sometimes reduces effectiveness. To address this, a new term weighting scheme called Term Frequency with Average Term Occurrence (TF-ATO) was proposed and tested on English text to minimize the retrieval of unnecessary documents. In this paper, an information retrieval system is built for the Arabic language, and the Open-Source Arabic Corpora were used for the experiments. Weights were calculated using two schemes: traditional TF-IDF and the proposed TF-ATO. The results were then compared using evaluation measures. Using all obtained queries, four case studies were implemented with two approaches (stop word removal and stemming). In the English experiments, stop word removal was applied together with another discriminative approach, which calculates the centroid of documents. Analysis of the results showed that the proposed scheme is applicable to Arabic text and that the applied approaches enhance IR effectiveness when both are implemented. Furthermore, stop word removal had a favorable effect on both schemes, which was also shown in the English experiments.
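To make the two weighting schemes concrete, the following is a minimal Python sketch and not the authors' implementation. The Term Frequency-Inverse Document Frequency function uses the standard textbook form (raw term count times the natural log of N over document frequency). The abstract does not state the Term Frequency with Average Term Occurrence formula, so the version below is an assumption: each term count is divided by the document's average term occurrence, taken as total tokens over unique terms. The function names, the token-list input format, and the toy tokens are all hypothetical.

```python
import math
from collections import Counter


def remove_stop_words(tokens, stop_words):
    """Drop tokens found in a caller-supplied stop word set."""
    return [t for t in tokens if t not in stop_words]


def tf_idf(docs):
    """Textbook TF-IDF: raw count * ln(N / document frequency).

    docs is a list of token lists; returns one {term: weight} dict per document.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: c * math.log(n_docs / df[t]) for t, c in tf.items()})
    return weights


def tf_ato(docs):
    """ASSUMED form of TF-ATO: count / average term occurrence in the document.

    The average term occurrence is taken here as
    (total tokens in the document) / (unique terms in the document).
    """
    weights = []
    for doc in docs:
        tf = Counter(doc)
        if not tf:
            weights.append({})
            continue
        ato = len(doc) / len(tf)
        weights.append({t: c / ato for t, c in tf.items()})
    return weights


if __name__ == "__main__":
    # Toy, already-tokenized documents (placeholder tokens, not corpus data).
    corpus = [["a", "a", "b"], ["b", "c"]]
    print(tf_idf(corpus))
    print(tf_ato(corpus))
```

Note that in this sketch the assumed TF-ATO weight depends only on statistics of the document itself, whereas TF-IDF needs collection-wide document frequencies, so a term that appears in every document gets a TF-IDF weight of zero.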

Journal: International Journal on Advanced Science, Engineering and Information Technology
Publication Date: Dec 11, 2022
License type: cc-by-sa

Similar Papers

Term frequency with average term occurrences for textual information retrieval
O Ali Sadek Ibrahim ... D Landa-Silva
Soft Computing | VOL. 20
28 Nov 2015

Turning from TF-IDF to TF-IGM for term weighting in text classification
Kewen Chen ... Hao Zhang
Expert Systems with Applications | VOL. 66
09 Sep 2016

TF-TDA: A Novel Supervised Term Weighting Scheme for Sentiment Analysis
Arwa Alshehri ... Abdulmohsen Algarni
Electronics | VOL. 12
30 Mar 2023

A new weighting scheme and discriminative approach for information retrieval in static and dynamic document collections
Osman A S Ibrahim ... Dario Landa-Silva
01 Sep 2014
