Abstract

In this paper, we propose an abstraction for maintaining XML data partitions, aimed in particular at holistic twig join processing in a cluster system and built on a multidimensional data model. Because the XML documents, XML schemas and queries in our system are numerous and intricate, we extract their metadata and use it to define the relationships among them in the multidimensional data model. For partitioning, we propose a series of multidimensional analysis operations organized in three basic steps: document clustering, query clustering and partition refinement. Each step yields partitions together with their associated costs, computed by a cost model that takes query processing cost as its basis. During a simulated distribution of partitions to the cluster computers, we refine some of the partitions residing on an overloaded cluster node and redistribute them so that costs are well balanced among all cluster nodes. Finally, we show the effectiveness of the proposed method, as indicated by the minimized cost variance across the cluster system and good query execution performance.
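As a rough illustration of the balancing goal described above, the following minimal sketch distributes partitions with known costs over cluster nodes and reports the resulting cost variance. It is not the method of the paper: the greedy largest-cost-first heuristic, the function names `distribute` and `cost_variance`, and the sample partition costs are all assumptions made for illustration, and the partition refinement step (splitting partitions on an overloaded node) is not modeled.

```python
import heapq
from statistics import pvariance


def distribute(partition_costs, num_nodes):
    """Assign partitions to nodes greedily (hypothetical heuristic).

    Partitions are taken in order of decreasing cost, and each one is
    placed on the node whose accumulated cost is currently lowest.
    Returns (assignment, loads): the partition ids per node and the
    total cost per node.
    """
    # Min-heap of (accumulated cost, node id); all nodes start empty.
    heap = [(0.0, node) for node in range(num_nodes)]
    heapq.heapify(heap)
    assignment = {node: [] for node in range(num_nodes)}

    by_cost = sorted(partition_costs.items(), key=lambda kv: kv[1], reverse=True)
    for partition_id, cost in by_cost:
        load, node = heapq.heappop(heap)  # least-loaded node so far
        assignment[node].append(partition_id)
        heapq.heappush(heap, (load + cost, node))

    loads = [0.0] * num_nodes
    for load, node in heap:
        loads[node] = load
    return assignment, loads


def cost_variance(loads):
    """Population variance of per-node costs; lower means better balance."""
    return pvariance(loads)


if __name__ == "__main__":
    # Illustrative costs only; real costs would come from a cost model.
    costs = {"p1": 8, "p2": 7, "p3": 6, "p4": 5, "p5": 4}
    assignment, loads = distribute(costs, num_nodes=2)
    print(assignment, loads, cost_variance(loads))
```

In a setting like the one described, the per-partition costs would come from the query-processing cost model, and a node whose load stays well above the mean after distribution would be a candidate for having its partitions refined and redistributed.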
