Abstract
Problem statement: To improve the performance of data retrieval from a large, homogeneous XML document. Approach: XML elements are clustered by content and then indexed. The element used for clustering is identified from the document and/or its XML schema and serves as the clustering parameter. A suitable index is created after clustering. Results: Clustering combined with an indexing strategy supports efficient retrieval of XML elements from the document. Conclusion: The proposed method improves the efficiency of XML data manipulation and gives better performance than clustering or indexing alone.
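The approach described in the abstract can be illustrated with a minimal sketch. It assumes a toy document in which each `book` record carries a `genre` element chosen as the clustering parameter and a `title` element chosen as the index key. These tag names, the sample data, and the helper functions `cluster_by_element`, `build_index`, and `lookup` are all hypothetical and are not taken from the paper, which evaluates its method on database engines rather than in-memory dictionaries.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample document; the tag names are assumptions for illustration.
SAMPLE = """
<library>
  <book><title>Dune</title><genre>fiction</genre></book>
  <book><title>Cosmos</title><genre>science</genre></book>
  <book><title>Emma</title><genre>fiction</genre></book>
</library>
"""

def cluster_by_element(root, record_tag, key_tag):
    """Group records by the text content of the chosen clustering element."""
    clusters = defaultdict(list)
    for record in root.iter(record_tag):
        key = (record.findtext(key_tag) or "").strip()
        clusters[key].append(record)
    return dict(clusters)

def build_index(clusters, index_tag):
    """Build one lookup table per cluster, keyed by the indexed element's text."""
    return {
        key: {(r.findtext(index_tag) or "").strip(): r for r in records}
        for key, records in clusters.items()
    }

def lookup(index, cluster_key, value):
    """Search only inside the matching cluster instead of the whole document."""
    return index.get(cluster_key, {}).get(value)

root = ET.fromstring(SAMPLE)
clusters = cluster_by_element(root, "book", "genre")
index = build_index(clusters, "title")
hit = lookup(index, "science", "Cosmos")
```

In this sketch a query first selects the cluster and then consults that cluster's index, so only a fraction of the records is ever examined. That is the intuition behind combining the two techniques, not a reproduction of the paper's implementation.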
Highlights
The evolution of Internet and communication technologies has helped business enterprises gain more benefits
In this study, the performance of the proposed similarity-based clustering method is analyzed on the SQL Server 2005 database engine and Oracle Berkeley Database XML (BDXML)
Common XPath (CXP) is a method of XML structural representation that encodes frequently occurring elements together with their hierarchical information; the mined CXPs are proposed as the feature vectors for XML document clustering
Summary
The evolution of Internet and communication technologies has helped business enterprises gain more benefits. A clustering method based on path patterns is presented in (Leung et al., 2005b). It uses an XML structural representation called Common XPath (CXP), which encodes frequently occurring elements together with their hierarchical information, and proposes taking the mined CXPs as the feature vectors for XML document clustering. Edit-distance-based approaches differ from each other in the set of allowed edit operators and in their support for repetitive and optional XML elements. It has been proved in (Zhang et al., 1992) that computing the edit distance for unordered labeled trees is NP-complete, and yet the distance is not optimal in any sense related to the elements' semantics.
Fig. 1: Pseudocode for the clustering algorithm
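The idea of turning frequently occurring paths into feature vectors can be sketched as follows. This is a simplified stand-in, not the CXP mining procedure of Leung et al.: it assumes that a "common path" is any root-to-element label path appearing in at least `min_support` documents, and it produces binary vectors. The sample documents and the function names `element_paths`, `common_paths`, and `feature_vector` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def element_paths(root):
    """Collect the set of root-to-element label paths in one document."""
    paths = set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths

def common_paths(doc_paths, min_support):
    """Keep paths that occur in at least min_support documents (assumed criterion)."""
    counts = Counter()
    for paths in doc_paths:
        counts.update(paths)
    return sorted(p for p, c in counts.items() if c >= min_support)

def feature_vector(paths, vocabulary):
    """Binary vector: 1 if the document contains the common path, else 0."""
    return [1 if p in paths else 0 for p in vocabulary]

# Hypothetical documents used only to exercise the functions.
DOCS = [
    "<book><title/><author/></book>",
    "<book><title/><price/></book>",
    "<article><title/><author/></article>",
]

doc_paths = [element_paths(ET.fromstring(d)) for d in DOCS]
vocabulary = common_paths(doc_paths, min_support=2)
vectors = [feature_vector(p, vocabulary) for p in doc_paths]
```

Vectors built this way can be fed to any standard clustering routine. Documents that share the same frequent paths receive identical or nearby vectors, which avoids the pairwise tree edit distance computation discussed above.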