Abstract
In this article the authors discuss the challenges of performing reasoning on large scale RDF datasets from the Web. Using ter-Horst’s pD* fragment of OWL as a base, the authors compose a rule-based framework for application to web data: they argue their decisions using observations of undesirable examples taken directly from the Web. The authors further temper their OWL fragment through consideration of “authoritative sources” which counter-acts an observed behaviour which they term “ontology hijacking”: new ontologies published on the Web re-defining the semantics of existing entities resident in other ontologies. They then present their system for performing rule-based forward-chaining reasoning which they call SAOR: Scalable Authoritative OWL Reasoner. Based upon observed characteristics of web data and reasoning in general, they design their system to scale: the system is based upon a separation of terminological data from assertional data and comprises of a lightweight in-memory index, on-disk sorts and file-scans. The authors evaluate their methods on a dataset in the order of a hundred million statements collected from real-world Web sources and present scale-up experiments on a dataset in the order of a billion statements collected from the Web.Request access from your librarian to read this article's full text.
Published Version (
Free)
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have