Distributed Queries and Query Optimization in Schema-Based P2P-Systems

Ingo Brunkhorst,Wolfgang Nejdl,Hadhami Dhraief,Alfons Kemper,Christian Wiesner

doi:10.1007/978-3-540-24629-9_14

Abstract

AbstractDatabases have employed a schema-based approach to store and retrieve structured data for decades. For peer-to-peer (P2P) networks, similar approaches are just beginning to emerge, also motivated by the fact, that sending (atomic) queries to the appropriate peers clearly fails for queries which need data from more than one peer to be executed. While quite a few database techniques can be re-used in this new context, a P2P data management infrastructure poses additional challenges which have to be solved before schema-based P2P networks become as common as schema-based databases. Because of the dynamic nature of P2P networks, we can neither assume global knowledge about data distribution, nor are static topologies and static query plans suitable for these networks. Unlike in traditional distributed database systems, we cannot assume a complete schema instance but rather work with a distributed schema which directs query processing tasks from one node to one or more neighboring nodes.In this paper, we will first discuss a suitable topology for schema-based P2P networks and how distributed knowledge about data distribution can be stored, accessed and updated based on that topology. Second we will describe how this knowledge can be used to distribute abstract query plans through the P2P network and expand them on the fly such that we can place query operators next to data sources and utilize distributed computing resources more effectively.KeywordsSpan TreeQuery ProcessingQuery OptimizationQuery OperatorQuery PlanThese keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Full Text