Towards Adaptively Approximated Search in Distributed Architectures

Barbara Catania, Giovanna Guerrini

doi:10.1007/978-3-642-17551-0_7

Abstract

Innovative applications over distributed architectures, such as the Web, often require the analysis of strongly related, highly heterogeneous data stored in remote, autonomous data sources. Such data can either be fully available at query processing time (stored data) or become available as a continuous stream (data streams). In these contexts, search efficiency is a key issue. However, classical processing techniques execute queries exactly, both with respect to the request and with respect to the processing technique, which is fixed at the beginning of execution; such techniques may not ensure adequate performance or adequate quality (in terms of completeness and accuracy) of the returned result. To overcome this problem, approximate and adaptive query processing techniques have been proposed. Adaptive techniques aim at ensuring efficient query processing whenever the a priori information needed to statically select, once at the beginning of processing, the most efficient processing technique is not available. Approximation, by contrast, has been proposed to ensure higher result quality in the presence of data heterogeneity and limited data knowledge. In highly dynamic and heterogeneous environments, these two approaches have usually been considered orthogonal. However, we claim that applications exist that could benefit from a combined approach. One example is Web applications that allow users to specify queries on heterogeneous data (streams) retrieved through mash-ups from different sites. Since data are dynamically acquired, they cannot be statically reconciled before queries are processed. Moreover, adopting a single approximate search strategy, fixed a priori, could penalize system efficiency and/or result quality whenever heterogeneity characterizes only subsets of the input data.
The aim of this chapter is to take a step towards integrating these two approaches by introducing Approximate Search with Adaptive Processing (ASAP for short) systems. In ASAP, decisions concerning when, how, and how much to approximate are made dynamically, with the goal of optimizing both result quality and processing efficiency.
