Integrating domain heterogeneous data sources using decomposition aggregation queries

Jian Xu, Rachel Pottinger

doi:10.1016/j.is.2013.06.003

Abstract

This paper introduces the decomposition aggregation query (DAQ), which extends semantic integration queries by allowing query translation to create aggregate queries based on the DAQ's novel three-role structure. We describe the application of DAQs to integrating domain heterogeneous data sources, the new semantics of DAQ answers, and the query translation algorithm, called “aggregation rewriting”. A central problem in optimizing DAQ processing is determining the data sources to which the DAQ is translated. Our source selection algorithm has cover-finding and partitioning steps, which are optimized to (1) lower the processing overhead while speeding up query answering and (2) eliminate duplicates with minimal overhead. We establish connections between the source selection optimizations and classic NP-hard optimization problems, and solve them with efficient solvers. We empirically study both the DAQ query translation and the source selection algorithms using real-world and synthetic data sets. The results show that the query translation algorithm scales well with respect to both the size of aggregations and data sources, and that the source selection algorithms save a considerable amount of computational resources.
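
The cover-finding and partitioning steps can be pictured with a small sketch. The abstract does not say which solvers the authors use, so the greedy set-cover heuristic below is an assumption standing in for them, not the paper's algorithm. The function names, source names (s1 to s4), and component labels are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_cover(required: Set[str], sources: Dict[str, Set[str]]) -> List[str]:
    """Cover-finding (assumed greedy heuristic): pick sources until
    every required item is provided by at least one chosen source."""
    uncovered = set(required)
    chosen: List[str] = []
    while uncovered:
        # Take the source that covers the most still-uncovered items;
        # iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(sources), key=lambda name: len(sources[name] & uncovered))
        gained = sources[best] & uncovered
        if not gained:
            raise ValueError(f"no source covers: {sorted(uncovered)}")
        chosen.append(best)
        uncovered -= gained
    return chosen


def partition(required: Set[str], sources: Dict[str, Set[str]],
              chosen: List[str]) -> Dict[str, Set[str]]:
    """Partitioning: assign each required item to exactly one chosen
    source, so no item is aggregated twice (duplicate elimination)."""
    remaining = set(required)
    assignment: Dict[str, Set[str]] = {}
    for name in chosen:
        assignment[name] = sources[name] & remaining
        remaining -= assignment[name]
    return assignment


# Hypothetical example: four items to aggregate over, four overlapping sources.
required = {"wall", "door", "window", "roof"}
sources = {
    "s1": {"wall", "door"},
    "s2": {"door", "window"},
    "s3": {"window", "roof"},
    "s4": {"roof"},
}
chosen = greedy_cover(required, sources)
assignment = partition(required, sources, chosen)
```

In this toy instance two of the four sources suffice, and the partition gives each item a single owner, so an aggregate such as a sum over all items would count each item once. Finding a minimum cover is the classic NP-hard set cover problem, which is why a heuristic is used here.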

Journal: Information Systems
Publication Date: Jun 19, 2013
Citations: 8

