Abstract

Extensible markup language (XML), a simplified version of standard generalized markup language (SGML), is designed to enable electronic text interchange on the Internet. XML documents have a rigorously described structure that can be analyzed by computers and easily understood by humans. Most current approaches store XML documents in file systems or in relational database systems. However, the nature and design of a file system or a relational database schema can fit poorly with the structure of XML documents. In this paper, we present an automatic load/extract scheme to store and retrieve XML documents through object-relational databases. We propose an architecture, called XML meta-generator (XMG), which, after reading a specific document type definition (DTD), automatically generates the corresponding object-relational database schema (OR-Schema), a DI-Decomposer, and a DI-Reconstructor:

1. OR-Schema: an object-relational database schema in UniSQL/X format for a specific DTD.
2. DI-Decomposer: a module that decomposes XML document instances (DIs) according to the specific DTD format and stores the elements in the corresponding object-relational database.
3. DI-Reconstructor: a module that retrieves elements from the object-relational database and reconstructs them to recover the original DI.

These modules allow XML documents to be automatically decomposed into, and reconstructed from, object-relational databases in a seamless manner. Moreover, documents stored in object-relational databases can be managed and queried more easily than documents stored in file systems or relational databases. Applications such as digital libraries, data warehouses, and data or text mining systems can also be easily built on top of the target database.
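The decompose/reconstruct round trip described above can be illustrated with a minimal sketch. This is not the code that XMG generates and it does not use UniSQL/X: it assumes an in-memory dictionary as a stand-in for the object-relational database, ignores the DTD entirely, and uses hypothetical names (`ElementRow`, `decompose`, `reconstruct`) chosen only for illustration. Each XML element becomes one stored object with an identifier, a parent reference, and an ordered list of child identifiers, which is roughly the kind of information a reconstructor would need to recover the original document instance.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ElementRow:
    """One stored element; a hypothetical stand-in for an object in an OR-Schema."""
    oid: int
    parent: Optional[int]
    tag: str
    attrs: Dict[str, str]
    text: Optional[str]
    tail: Optional[str]
    children: List[int] = field(default_factory=list)  # ordered child oids


def decompose(xml_text: str) -> Dict[int, ElementRow]:
    """Walk the document and store every element as a row keyed by object id."""
    store: Dict[int, ElementRow] = {}

    def visit(elem: ET.Element, parent: Optional[int]) -> int:
        oid = len(store)  # next free object id
        store[oid] = ElementRow(oid, parent, elem.tag, dict(elem.attrib),
                                elem.text, elem.tail)
        for child in elem:
            store[oid].children.append(visit(child, oid))
        return oid

    visit(ET.fromstring(xml_text), None)
    return store


def reconstruct(store: Dict[int, ElementRow], oid: int = 0) -> ET.Element:
    """Rebuild the element tree from stored rows, starting at the root (oid 0)."""
    row = store[oid]
    elem = ET.Element(row.tag, row.attrs)
    elem.text, elem.tail = row.text, row.tail
    for child_oid in row.children:
        elem.append(reconstruct(store, child_oid))
    return elem


# Illustrative document instance (made up for this sketch).
doc = ('<article id="a1"><title>XMG</title>'
       '<body>Stores <b>XML</b> in a database.</body></article>')
store = decompose(doc)
roundtrip = ET.tostring(reconstruct(store), encoding="unicode")
```

Because every element is stored as a separate object, a query such as "all `title` elements" reduces to a scan over `store.values()` rather than a parse of the whole file. A DTD-driven schema, as proposed in the paper, would presumably go further and give each element type its own class rather than the single generic row type assumed here.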

