Abstract

Several researchers have considered integrating multiple unstructured, semi-structured, and structured data sources by modeling all sources as edge-labeled graphs. Data in this model is self-describing and dynamically typed, and captures both schema and data information. The labels are arbitrary atomic values, such as strings, integers, and reals, and the integrated data graph is stored in a unique data repository as a relation of edges. The relation is dynamically typed, i.e., each edge label is tagged with its type. Although the unique, labeled graph repository is flexible, it loses all static type information and incurs severe efficiency penalties compared to querying structured databases, such as relational or object-oriented databases. In this paper we propose an alternative method of storing and querying semi-structured data, using storage schemas, which are closely related to the recently introduced graph schemas [BDFS97]. A storage schema splits the graph's edges into several relations, some of which may have labels of known types (such as strings or integers) while others may still be dynamically typed. We show that all positive queries in UnQL, a query language for semi-structured data, can be translated into conjunctive queries against the relations in the storage schema. This result may be surprising, because UnQL is a powerful language, featuring regular path expressions, restructuring queries, joins, and unions. We use this technique to translate queries on the integrated, semi-structured data into queries on the external sources. In this setting the integrated semi-structured data is not materialized but virtual, and the problem is to translate a query against the integrated view, possibly involving regular path expressions and restructuring, into queries that can be answered by the external sources. Here we again use the storage schema, this time to split the graph into relations according to their sources. Any positive UnQL query is decomposed based on these relations and translated into queries on the external sources.
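
As a rough illustration of the idea of splitting a single edge relation into several typed relations, the following Python sketch partitions a toy edge-labeled graph by label type and evaluates a fixed path as a join over the resulting relations. It is only a sketch under stated assumptions: the function names (`split_by_label_type`, `follow`, `join`), the sample data, and the choice to partition purely by label type are hypothetical and are not the storage schemas or the UnQL translation defined in the paper.

```python
from collections import defaultdict

# Edge-labeled graph stored as one dynamically typed relation of
# (source node, label, target node) triples. Sample data is made up.
edges = [
    (0, "paper", 1),
    (1, "title", 2),
    (2, "Storage Schemas", 3),
    (1, "year", 4),
    (4, 1997, 5),
]

def split_by_label_type(edge_relation):
    """Partition edges into one relation per label type.

    Assumption: a real storage schema would be derived from a graph
    schema; grouping by Python type name is a simplification.
    """
    relations = defaultdict(list)
    for src, label, dst in edge_relation:
        relations[type(label).__name__].append((src, label, dst))
    return dict(relations)

def follow(relation, label):
    """Select the (source, target) pairs of edges carrying `label`."""
    return [(s, d) for s, l, d in relation if l == label]

def join(left, right):
    """Join two binary relations on left.target == right.source."""
    return [(s, d2) for s, d1 in left for s2, d2 in right if d1 == s2]

relations = split_by_label_type(edges)

# The path paper.year touches only the string-labeled relation.
paper_year = join(
    follow(relations["str"], "paper"),
    follow(relations["str"], "year"),
)

# Integer leaf labels below those nodes come from the statically
# typed integer relation, with no per-edge type tag to inspect.
reached = {target for _, target in paper_year}
years = [label for src, label, _ in relations["int"] if src in reached]
```

In this toy setting, the query over the split relations is a plain selection and join, which loosely mirrors the claim that positive queries can be expressed as conjunctive queries against the relations of the storage schema.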
