Abstract

Distributed data sources can be heterogeneous in their formats, schemas, quality, access mechanisms, ownership, access policies, and capabilities. Models and techniques are therefore needed for managing different data resources in an integrated way. Data integration is the flexible and managed federation, analysis, and processing of data from different distributed sources. It is becoming as important as data mining for exploiting the value of the large, distributed data sets available today. Distributed processing infrastructures such as Grids and peer-to-peer networks can be used for data integration across geographically distributed sites. This paper presents a service-based architecture for data integration on Grids. The basic model is discussed, and its implementation based on the OGSA Globus architecture is described.
