Abstract
Problem statement: Processing XML queries requires an efficient indexing method. Index generation includes labeling the nodes, but for dynamic data the labels must be recomputed whenever the document changes. Approach: The structural indexing that is used affects the performance of the query processing system. A dynamic XML document allows insert, delete and update operations. Results: For frequently changed documents the indexes also change, i.e., the labels must be recomputed frequently, which makes the system inconvenient. To avoid this problem, the New Labeling Scheme for XML (NLSX) uses only lowercase letters, uppercase letters and digits to generate persistent labels for the nodes in the document. The proposed New Labeling Scheme for XML using Unicode Characters (NLSXU) also uses Unicode characters. It therefore provides more character combinations for persistently labeling the nodes, which greatly reduces the space needed to store the labels. With NLSXU, the index size for the real-world data sets is reduced by 81% compared with the existing scheme NLSX. The results show that NLSXU reduces the index size for the synthetic data sets by up to 26, 34, 71 and 95% compared with the NLSX, LSDX, GRP and SP schemes, respectively. Compared with the LSDX scheme, NLSXU reduces the time taken to generate the labels by 96 and 80% for the real-world and synthetic data sets, respectively. Conclusion: Compared with the NLSX scheme, NLSXU reduces the time taken to generate the labels by 66 and 15% for the real-world and synthetic data sets, respectively, and thus improves the performance of the query system.
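The claim that a larger character set allows shorter labels can be illustrated with simple counting. The sketch below is not the NLSXU algorithm; it only computes the shortest fixed label length that can distinguish a given number of nodes for a given alphabet size. The function name and the extended alphabet size of 1000 are hypothetical values chosen for illustration. Lengths are counted in characters; the size in bytes depends on the encoding used to store the labels.

```python
def min_label_length(num_labels: int, alphabet_size: int) -> int:
    """Smallest n such that alphabet_size ** n >= num_labels."""
    if num_labels < 1 or alphabet_size < 2:
        raise ValueError("need num_labels >= 1 and alphabet_size >= 2")
    length = 1
    capacity = alphabet_size  # distinct labels available at this length
    while capacity < num_labels:
        capacity *= alphabet_size
        length += 1
    return length


# Lowercase letters, uppercase letters and digits, as described for NLSX.
ALNUM_SIZE = 26 + 26 + 10
# Hypothetical larger alphabet, standing in for an extended character set.
EXTENDED_SIZE = 1000

nodes = 1_000_000
print(min_label_length(nodes, ALNUM_SIZE))     # length needed with 62 characters
print(min_label_length(nodes, EXTENDED_SIZE))  # length needed with 1000 characters
```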
Highlights
In these wonderful and privileged times, none of this would be possible without data
While processing the XML document, each node is assigned a unique persistent label as an index, which gives fast access to the nodes when the data is queried
Prefix-based numbering methods were proposed by Cohen et al. (2002), in which a specific code is assigned to each node
Unicode characters in labels: In NLSXU, the labels are generated using letters, digits and Unicode characters (Unicode Consortium, 2010)
Summary
In these wonderful and privileged times, none of this would be possible without data. The indexing methods for XML documents involve various numbering and labeling schemes. While processing the XML document, each node is assigned a unique persistent label as an index, which gives fast access to the nodes when the data is queried. When the document is updated frequently, re-computing the labels, rather than evaluating the query, takes most of the query processing time. The parent-child relationship can be inferred from the label values of the nodes, which is useful for structural query processing (http://www.w3.org/TR/xquery/). The proposed method provides ways of inserting, deleting and updating nodes without changing their label values.
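To make the idea of inferring the parent-child relationship from label values concrete, here is a minimal sketch of a generic prefix-based labeling. It is an assumption for illustration, not the NLSX or NLSXU scheme itself: the "." separator, the single-character codes, and the helper names `child_label`, `is_parent` and `is_ancestor` are all hypothetical.

```python
SEP = "."  # hypothetical separator between the parent's label and the child's own code


def child_label(parent: str, code: str) -> str:
    """Build a child's label by appending its own code to the parent's label."""
    return f"{parent}{SEP}{code}"


def is_parent(a: str, b: str) -> bool:
    """True if label a is the direct parent of label b."""
    head, sep, _ = b.rpartition(SEP)
    return sep == SEP and head == a


def is_ancestor(a: str, b: str) -> bool:
    """True if label a is a proper ancestor of label b."""
    return b.startswith(a + SEP)


root = "a"
book = child_label(root, "b")    # "a.b"
title = child_label(book, "c")   # "a.b.c"

print(is_parent(root, book))     # direct child of the root
print(is_parent(root, title))    # grandchild, so not a direct child
print(is_ancestor(root, title))  # but still a descendant
```

Because both checks use only the two label strings, a structural query can test the relationship without traversing the document, and existing labels stay unchanged when a new child is appended under a parent.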