Abstract

Minimizing the time spent transferring, over a network, the intermediate results produced during the execution of a distributed query is a fundamental problem in distributed database management systems. We take a new look at this problem by investigating the relationship between communication time and the remote data access middleware. We focus on two middleware parameters that are manually tuned by database administrators or programmers: the fetch size (i.e., the number of tuples that are communicated at once) and the message size (i.e., the size of the buffer at the middleware level). We present an experimental study showing that these parameters have a crucial impact on communication time. We then propose the MIND framework, which tunes these middleware parameters while adapting to different queries (which may vary in selectivity) and networks (which may vary in bandwidth). The main technical contributions of MIND are (i) a communication time estimation function that takes into account the middleware parameters, the size of the query result, and the network environment, and (ii) an iterative optimization algorithm that finds a fetch size and a message size offering a good trade-off between low resource consumption and low communication time. We conclude with an experimental study that demonstrates the effectiveness of the MIND framework.
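As a rough illustration of why the fetch size and the message size matter, the following Python sketch uses a deliberately simple cost model: one network latency per fetch round trip, a fixed overhead per message, and the raw transfer time of the result. This is not the MIND estimation function or the MIND algorithm, which the abstract does not specify. The function names, the cost terms, the doubling search, and the `min_gain` stopping threshold are all assumptions made for illustration.

```python
import math


def estimate_comm_time(n_tuples, tuple_bytes, fetch_size, message_bytes,
                       latency_s, bandwidth_Bps, per_message_s):
    """Toy communication time estimate (an assumed model, not MIND's)."""
    full_batches, rest = divmod(n_tuples, fetch_size)
    # One round trip per batch of fetch_size tuples, plus one for a partial batch.
    round_trips = full_batches + (1 if rest else 0)
    # Each batch is split into messages of at most message_bytes bytes.
    messages = full_batches * math.ceil(fetch_size * tuple_bytes / message_bytes)
    if rest:
        messages += math.ceil(rest * tuple_bytes / message_bytes)
    transfer_s = n_tuples * tuple_bytes / bandwidth_Bps
    return round_trips * latency_s + messages * per_message_s + transfer_s


def tune(n_tuples, tuple_bytes, latency_s, bandwidth_Bps, per_message_s,
         max_message_bytes=1 << 20, min_gain=0.05):
    """Hypothetical iterative search: double the fetch size until the
    relative gain in estimated time drops below min_gain, so that buffers
    stop growing once the benefit becomes marginal."""
    fetch_size = 1
    message_bytes = min(tuple_bytes, max_message_bytes)
    best = estimate_comm_time(n_tuples, tuple_bytes, fetch_size, message_bytes,
                              latency_s, bandwidth_Bps, per_message_s)
    while fetch_size < n_tuples:
        cand_fetch = fetch_size * 2
        # Assumption: the buffer is sized to hold one batch, up to a cap.
        cand_msg = min(cand_fetch * tuple_bytes, max_message_bytes)
        t = estimate_comm_time(n_tuples, tuple_bytes, cand_fetch, cand_msg,
                               latency_s, bandwidth_Bps, per_message_s)
        if (best - t) / best < min_gain:
            break
        fetch_size, message_bytes, best = cand_fetch, cand_msg, t
    return fetch_size, message_bytes, best
```

Under this toy model, fetching one tuple at a time pays the network latency once per tuple, whereas larger batches amortize it at the cost of larger buffers; the stopping threshold is one simple way to express the trade-off between resource consumption and communication time.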
