Abstract

In order to manipulate XML data, a programming or query language should provide some primitives to deconstruct them, in particular to pinpoint and capture some subparts of the data.

Among the various proposals for primitives for deconstructing XML data, two different and complementary approaches clearly stem from practice: path expressions (usually XPath paths [7], but also the “dot” navigation of Cω [3]) and regular expression patterns [13].

Path expressions are navigational primitives that point out where to capture data substructures. They (and those of Cω, in particular) closely resemble the homonymous primitives used by OQL [9] in the context of OODB query languages, with the difference that instead of sets of objects they return sets or sequences of elements: more precisely, all elements that can be reached by following the path at issue. These primitives are at the basis of standard languages such as XSLT [8] and XQuery [4].

More recently, a new kind of deconstructing primitive was proposed, regular expression patterns [13], which extend with regular expressions the pattern matching primitive popularised by functional languages such as ML and Haskell. Regular expression patterns were first introduced in the XDuce [12] programming language and are becoming more and more popular, since they are being adopted by languages as different as ℂDuce [1] (a general purpose extension of the XDuce language) and its query language ℂQL [2], Xtatic [10] (an extension of C#), Scala [15] (a general purpose Java-like object-oriented language that compiles into Java bytecode), XHaskell [14], and the extension of Haskell proposed in [5].

The two kinds of primitives are not antagonists, but rather orthogonal and complementary.
Path expressions implement a “vertical” exploration of data, as they capture elements that may be at different depths, while patterns perform a “horizontal” exploration of data, since they are able to perform a finer-grained decomposition of sequences of elements. Both kinds of primitives are quite useful and they complement each other nicely. Therefore, it would seem natural to integrate both of them in a query or programming language for XML. Despite that, we are aware of just two works in which both primitives are embedded (and, even then, only loosely coupled): in ℂQL it is possible to write select-from-where expressions, where regular expression patterns are applied in the from clause to sequences that are returned by XPath-like expressions; Gapeyev and Pierce [11] show how it is possible to use regular expression patterns with an all-match semantics to encode a subset of XPath, and plan to use this encoding to add XPath to the Xtatic programming language.

The lack of study of the integration of these two primitives may be due to the fact that each of them is adopted by a different community: regular expression patterns are almost confined to the programming language community, while XPath expressions are pervasive in the database community.

The goal of this lecture is to give a brief presentation of the regular expression pattern style together with the type systems to which these patterns are tightly connected, that is, the type systems based on semantic subtyping [6]. We are not promoting the use of patterns to the detriment of path expressions, since we think that the two approaches should be integrated in the same language, and we see in that a great opportunity for collaboration between the database and the programming language communities. Since the author belongs to the latter, this lecture tries to describe the pattern approach by addressing some points that should be of interest to the database community as well.
In particular, after a general overview of regular expression patterns and types, in which we show how to embed patterns in a select-from-where expression, we discuss several usages of these patterns and types, going from the classic use for partial correctness and schema specification to the definition of new data iterators, and from the specification of efficient run-time to the definition of logical pattern-specific query optimisations.

Keywords: Query Language; Query Optimisation; Path Expression; XPath Expression; Database Community (these keywords were added by machine and not by the authors)
