Abstract

Thanks to the rapid expansion of the Internet, anyone can now access a vast array of information online. However, as the volume of web content continues to grow exponentially, search engines face challenges in delivering relevant results. Early search engines relied primarily on the words or phrases found within web pages to index and rank them. While this approach had its merits, it often produced irrelevant or inaccurate results. To address this issue, more advanced search engines began incorporating the hyperlink structure of web pages to help determine their relevance. While this method improved retrieval accuracy to some extent, it still had limitations, as it did not consider the actual content of web pages. The objective of this work is to enhance Web Information Retrieval by leveraging three key components: text content analysis, link analysis, and log file analysis. By integrating insights from these multiple data sources, the goal is to rank relevant web pages in the retrieved document set more accurately and effectively, ultimately enhancing the user experience and delivering more precise search results. The proposed system was tested with both multi-word and single-word queries, and the results were evaluated using relative recall, precision, and F-measure. Compared to Google's PageRank algorithm, the proposed system demonstrated superior performance, achieving 81% mean average precision, 56% average relative recall, and a 66% F-measure.
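The abstract does not specify how the three relevance signals are fused or how the per-query metrics are computed. A minimal Python sketch is given below, assuming a simple weighted linear combination of normalized text, link, and log scores (the weights are illustrative, not taken from the paper) and the standard definitions of precision, relative recall, and F-measure.

    # Illustrative sketch only: the fusion weights and function names are
    # assumptions, not the paper's actual method.

    def combined_score(text_score, link_score, log_score,
                       w_text=0.4, w_link=0.3, w_log=0.3):
        """Weighted fusion of the three relevance signals (weights assumed)."""
        return w_text * text_score + w_link * link_score + w_log * log_score

    def precision(retrieved, relevant):
        """Fraction of retrieved pages that are relevant."""
        retrieved, relevant = set(retrieved), set(relevant)
        return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0

    def relative_recall(retrieved, relevant_pool):
        """Relevant pages found by this system, relative to the pooled
        relevant set contributed by all compared systems."""
        retrieved, relevant_pool = set(retrieved), set(relevant_pool)
        return (len(retrieved & relevant_pool) / len(relevant_pool)
                if relevant_pool else 0.0)

    def f_measure(p, r):
        """Harmonic mean of precision and recall."""
        return 2 * p * r / (p + r) if (p + r) else 0.0

Averaging these per-query values over the test queries would yield summary figures comparable in form to the reported mean average precision, average relative recall, and F-measure.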
