Abstract

This thesis focuses on cluster-based parallel database systems in which only one of the nodes has the database and the other nodes, which have no initial data, are used for parallel query processing. In such a system, the load of each node changes dynamically depending on the activities of the local users. In addition, in database query processing, data skew exists. Thus, it is very important to develop efficient load balancing/sharing algorithms. With the increasing Internet connectivity, it also becomes necessary to query databases spread around the world. This can incur arrival delay and/or the transfer rate fluctuations of input relations. This thesis investigates query processing algorithms (single- and multiple-join) under these conditions. For the single-join processing, a new algorithm which divides the hash buckets into chunks and uses them for load balancing is proposed. Two versions of this algorithm are introduced: a non-symmetric join algorithm (ChunkHJ) and a symmetric join algorithm (SCHJ). ChunkHJ is compared with two other algorithms: an adaptive nested-loop join and an adaptive GRACE join. These three algorithms are evaluated on a heterogeneous cluster with skewed data and various non-query background load. The results show that the proposed ChunkHJ is the best algorithm. SCHJ is compared with a dynamic round-robin algorithm, a sampling algorithm (without the Internet transfer delay) and an incremental hash mapping algorithm (with the Internet transfer delay). The results show that without the Internet transfer delay, the SCHJ algorithm is the best among these algorithms. With Internet transfer delay, the SCHJ algorithm is still better than the other incremental hash mapping algorithms. For the multiple-join processing, the Non-Symmetric Pipelined Hash Join (NSPHJ) and Symmetric Pipelined Hash Join (SPHJ) algorithms are evaluated with and without the Internet transfer delay. The results show that NSPHJ should be used for a cluster without the Internet transfer delay. Otherwise, SPHJ should be used.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.