Abstract

We present the systematic design and development of a distributed query scheduling service (DQS) in the context of DIOM, a distributed and interoperable query mediation system [26]. DQS consists of an extensible architecture for distributed query processing, a three-phase optimization algorithm for generating efficient query execution schedules, and a prototype implementation. Functionally, two important execution models for distributed queries, moving the query to the data and moving the data to the query, are supported and combined into a unified framework, so that data sources with limited search and filtering capabilities can be incorporated through wrappers into the distributed query scheduling process. Algorithmically, conventional optimization factors (such as join order) are considered separately from, and refined by, distributed system factors (such as data distribution, execution location, and heterogeneous host capabilities), allowing for stepwise refinement through three optimization phases: compilation, parallelization, and site selection and execution. A subset of the DQS algorithms has been implemented in Java to demonstrate the practicality of the architecture and the usefulness of the distributed query scheduling algorithm in optimizing execution schedules for inter-site queries.

