Improving density-based methods for hierarchical clustering of web pages

Morteza Haghir Chehreghani, Mostafa Haghir Chehreghani, Hassan Abolhassani

doi:10.1016/j.datak.2008.06.006

Abstract

The rapid growth of information on the web makes better information management techniques necessary, and clustering web data is among the most important of them. In this paper, we propose a new three-phase clustering method. In the first phase, density-based algorithms find the dense units in a data set, and the distances within those units are stored in order in a structure such as a min-heap. In the extraction phase, these distances are extracted one by one and their effect on the clustering process is examined. Finally, in the combination phase, clustering is completed using improved versions of the well-known single-linkage and average-linkage methods. Every step of the methods runs in O(n log n) time. The proposed methods therefore have low complexity, and experimental results show that they generate high-quality clusters. Further experiments show that they provide additional advantages, such as clustering by sampling.
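The heap-driven extraction and single-linkage combination described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `heap_single_linkage`, the `eps` threshold, and the use of "all pairs within `eps`" as a stand-in for dense units are assumptions made for illustration. The brute-force pair enumeration here is quadratic, so the sketch does not reproduce the O(n log n) bound claimed in the paper.

```python
import heapq
from itertools import combinations
from math import dist


def heap_single_linkage(points, eps):
    """Merge points closest-pair-first, using only distances <= eps."""
    # Assumption: a "dense unit" is approximated by every pair of points
    # whose Euclidean distance is at most eps (hypothetical parameter).
    heap = [
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
        if dist(points[i], points[j]) <= eps
    ]
    heapq.heapify(heap)  # min-heap ordered by distance

    # Union-find forest that tracks which cluster each point belongs to.
    parent = list(range(len(points)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    merges = []
    while heap:
        # Extraction: pop the smallest remaining distance.
        d, i, j = heapq.heappop(heap)
        ri, rj = find(i), find(j)
        if ri != rj:
            # Combination: single linkage joins the two clusters.
            parent[rj] = ri
            merges.append((d, i, j))

    clusters = {}
    for idx in range(len(points)):
        clusters.setdefault(find(idx), []).append(idx)
    return merges, sorted(clusters.values())


points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
merges, clusters = heap_single_linkage(points, eps=2.0)
print(clusters)
```

The recorded `merges` list gives the order and distance of each join, which is the information a dendrogram would be built from; points with no neighbour within `eps` remain singleton clusters.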

Journal: Data & Knowledge Engineering
Publication Date: Jun 24, 2008
Citations: 37
