Abstract

A new framework for document verification is presented that covers the entire process: document analysis, information extraction, document modeling, representation of background knowledge about the domain of discourse, user-level and formal representation of consistency criteria, verification by model checking, counterexample generation, and error reporting. Emphasis is placed on employing background knowledge to reduce complexity and to improve the quality of results in each step. A rule-based approach to information extraction supports the concise definition of extraction rules for document formats based on XML or HTML. It exceeds the expressiveness of existing extraction methods by supporting rule specialization, the integration of external tools, and access to background knowledge represented in ontologies. As a formal basis for representing consistency criteria, the new temporal description logic ALCCTL is proposed. In contrast to existing formalisms, it allows criteria on the coherence of content along individual reading paths to be represented and verified efficiently. The adequacy, performance, and effectiveness of the proposed framework are demonstrated in a case study on technical documentation.

