XCLSC: Structure and content-based clustering of XML documents

Karima Bessine,Abdelouahab Moussaoui,Hadda Cherroun,Attia Nehar

doi:10.1109/isps.2015.7244989

Abstract

This paper proposes a novel clustering approach for XML documents that combines their content and structure information, using tree structural-content summaries to reduce the size of each document. This reduction serves a twofold purpose: first, it shrinks the XML tree by eliminating redundant nodes; second, it gathers similar content together. Clustering is performed according to a similarity measure that takes into account both the structure and the content between levels. Several experiments explore the effectiveness of using tree structural summaries and constrained content in the clustering process. Empirical analysis reveals that the designed approach, which uses content within structure together with tree structural summaries, gives a better solution for XML clustering while improving runtime. It is well suited to large data sets.
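The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that "redundant nodes" means repeated sibling elements with the same tag, that "gathering similar content" means pooling the words found under merged nodes, and that the level-wise similarity is a weighted mix of Jaccard overlaps on tags and on words. The names `SummaryNode`, `summarize`, `levels`, `similarity`, and the weight `alpha` are all hypothetical.

```python
import xml.etree.ElementTree as ET


class SummaryNode:
    """One node of a summary tree: a tag, its pooled words, merged children."""

    def __init__(self, tag):
        self.tag = tag
        self.words = set()
        self.children = {}  # tag -> SummaryNode (same-tag siblings share one node)


def summarize(element, node=None):
    """Fold an XML element into a summary node, merging same-tag siblings."""
    if node is None:
        node = SummaryNode(element.tag)
    if element.text and element.text.strip():
        # Assumption: content is pooled as a lowercase bag of words.
        node.words.update(element.text.lower().split())
    for child in element:
        child_node = node.children.setdefault(child.tag, SummaryNode(child.tag))
        summarize(child, child_node)
    return node


def count_nodes(node):
    """Number of nodes in a summary tree."""
    return 1 + sum(count_nodes(c) for c in node.children.values())


def levels(node):
    """Return one (tags, words) pair of sets per depth of the summary tree."""
    out = []
    frontier = [node]
    while frontier:
        tags = {n.tag for n in frontier}
        words = set().union(*(n.words for n in frontier))
        out.append((tags, words))
        frontier = [c for n in frontier for c in n.children.values()]
    return out


def jaccard(a, b):
    """Jaccard overlap of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity(doc_a, doc_b, alpha=0.5):
    """Average per-level score; alpha weights structure against content.

    Levels present in only one document contribute zero (an assumption).
    """
    la, lb = levels(doc_a), levels(doc_b)
    total = 0.0
    for (tags_a, words_a), (tags_b, words_b) in zip(la, lb):
        total += alpha * jaccard(tags_a, tags_b) + (1 - alpha) * jaccard(words_a, words_b)
    return total / max(len(la), len(lb))
```

To try it, parse two documents with `ET.fromstring`, pass each root to `summarize`, and compare the results with `similarity`. The resulting pairwise scores could then feed any standard clustering routine; which one the paper uses is not stated in the abstract.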

Similar Papers

Collaborative clustering of XML documents
Sergio Greco ... Andrea Tagarelli
Journal of Computer and System Sciences | VOL. 77
05 Mar 2011

Clustering of XML documents based on structure and aggregated content
Nermeen Gamal Rezk ... Alsayed Algergawy
01 Dec 2016

Organizing XML Documents on a Peer–to–Peer Network by Collaborative Clustering
Francesco Gullo ... Sergio Greco
01 Jan 2012

Structure and Content Similarity for Clustering XML Documents
Lijun Zhang ... Zhanhuai Li
01 Jan 2009
