Abstract

Open data standards (e.g. LandXML, TransXML, CityGML) are key to addressing the interoperability issue in exchanging civil information modeling (CIM) data throughout the project life-cycle. Since these schemas include rich sets of data types covering a wide range of assets and disciplines, model view definitions (MVDs), which define subsets of a schema, are required to specify which types of data are to be shared in a specific exchange scenario. The traditional procedure for generating and implementing MVDs is time-consuming and laborious because the entities and attributes relevant to a particular data exchange context are manually identified by domain experts. This paper presents a method that can locate relevant information in a source XML data schema for a specific domain based on the user's keyword. The study employs a semantic resource of civil engineering terms to understand the semantics of a keyword-based query. The study also introduces a novel context-based search technique for retrieving related entities and their referenced objects. The developed method was tested on a gold standard of several LandXML subschemas. The experiment results show that the semantic MVD retrieval algorithm achieves a mean average precision of nearly 90%. The originality of the research lies in a novel method for extracting partial civil information models given a keyword from the end user. The method is expected to become a fundamental tool assisting professionals in extracting data from complex digital datasets.
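
For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean average precision is conventionally computed for ranked retrieval results. The queries, ranked entity lists, and relevance sets are illustrative assumptions and are not taken from the paper's gold standard; the function names are hypothetical.

```python
from typing import Iterable, Sequence, Set, Tuple


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average of precision@k over the ranks k at which a relevant item appears."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # Relevant items that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Mean of the per-query average precision values."""
    runs = list(runs)
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Illustrative data only (assumed, not from the paper).
runs = [
    (["Alignment", "Surface", "Profile"], {"Alignment", "Profile"}),
    (["Pipe", "Struct"], {"Pipe"}),
]
print(round(mean_average_precision(runs), 3))  # 0.917
```

In this toy case the first query scores (1/1 + 2/3) / 2 and the second scores 1, so the mean is about 0.917.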

Highlights

  • Neutral data standards have been widely accepted as an effective means for transferring the Civil Information Modeling (CIM) data of a civil infrastructure asset between project stakeholders

  • Validating the syntactic correctness of a model view is currently well supported by various rule-based algorithms. This automated validation function can significantly reduce the burden on software developers; however, they are still required to have a deep understanding of the semantics of the IFC schema to properly match it with the data exchange elements in an Information Delivery Manual (IDM)

  • Model view definition has been widely recognized as a means for facilitating seamless information exchange throughout the project life-cycle


Summary

INTRODUCTION

Neutral data standards have been widely accepted as an effective means for transferring the Civil Information Modeling (CIM) data of a civil infrastructure asset between project stakeholders. Developers need to interpret the semantics of the data keywords in an Information Delivery Manual (IDM) and look for matching entities, attributes, and types in the source schema. This task becomes extremely challenging, especially for large standards such as LandXML and IFC, which keep growing every year. Although open standards are structured using a systematic categorization method, i.e., by assets in CityGML or by disciplines in IFC, developers may still need to go through the entire schema, since a use case typically requires data from across different groups. This is even more problematic when the terms used in the IDM are inconsistent with the entity names of the source schema. The following sections describe related studies, knowledge gaps, and the architecture of the framework, followed by discussions of implications and conclusions.
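
The terminology mismatch described above can be illustrated with a small sketch that expands a user keyword with synonyms and then ranks schema entity labels by string similarity. This is not the paper's algorithm: the synonym table, the label list, and the function names are assumptions made for illustration, and `difflib.SequenceMatcher` from the Python standard library stands in for whatever similarity measure the authors use.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple

# Hypothetical synonym table standing in for a civil engineering
# semantic resource; the entries are illustrative assumptions.
SYNONYMS: Dict[str, List[str]] = {
    "road centerline": ["alignment"],
    "terrain": ["surface"],
}


def expand(keyword: str) -> List[str]:
    """Return the lowercased keyword followed by any known synonyms."""
    key = keyword.lower()
    return [key] + SYNONYMS.get(key, [])


def rank_labels(keyword: str, labels: List[str]) -> List[Tuple[str, float]]:
    """Score each schema label by its best similarity to any expanded term."""
    terms = expand(keyword)
    scored = []
    for label in labels:
        score = max(
            SequenceMatcher(None, term, label.lower()).ratio() for term in terms
        )
        scored.append((label, score))
    # Highest-scoring labels first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Entity labels chosen for illustration (assumed, not an exhaustive schema).
labels = ["Surface", "Alignment", "PipeNetwork"]
print(rank_labels("road centerline", labels)[0][0])  # Alignment
```

Without the synonym expansion, "road centerline" shares little surface text with "Alignment", which is the kind of inconsistency between IDM terms and schema entity names that makes purely lexical matching unreliable.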

Previous Studies on Automated MVD generation
Knowledge Gap
KEYWORD-DRIVEN METHODOLOGY FOR GENERATING XML SUBSCHEMAS
SCHEMA NETWORK CONSTRUCTION AND INDEXING
QUERY SEMANTICS INTERPRETATION
Domain knowledge base
Keyword expansion and query concept formulation
NODE MATCHING AND RANKING
Label String Similarity
BRANCH SEARCH AND MVD COMPOSITION
Branch Traversal Algorithm
Branch Merging for Subnetwork Formation
Experiment setup
Results and discussions
RESEARCH CONTRIBUTIONS AND IMPLICATIONS
CONCLUSIONS