Abstract

Modern data analytics systems benefit from fusing streaming data with linked data distributed on the Web. Accessing linked data at query time is usually prohibitively expensive. To reduce this cost, most database systems use a materialized view (a view) that stores local copies of the data. However, views maintained under conventional policies such as immediate, deferred, and periodic maintenance fail to achieve high accuracy for queries over data streams and linked data. To cope with these limitations, we propose a maintenance policy that schedules the expensive job of copying the latest version of linked data into views during idle time. In other words, we pre-fetch a portion of the linked data in advance according to its update pattern and the query evaluation semantics. Our multiple maintenance policies, which take changes in linked data into account, alleviate performance degradation at run time. Using real-world datasets, we report that the proposed method significantly improves response time compared to state-of-the-art methods.
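
The idea of refreshing views during idle time can be illustrated with a minimal sketch. It is not the paper's algorithm: the names `ProactiveView`, `ViewEntry`, `stale_keys`, and `refresh_when_idle`, the fixed per-entry `update_period`, and the `budget` parameter are all assumptions introduced here for illustration. An entry is refreshed ahead of time only if its remote source is expected to have changed before the next query evaluation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ViewEntry:
    value: str            # local copy of the linked data
    fetched_at: float     # time of the last copy
    update_period: float  # assumed estimate of how often the source changes


@dataclass
class ProactiveView:
    fetch: Callable[[str], str]  # hypothetical remote lookup
    entries: Dict[str, ViewEntry] = field(default_factory=dict)

    def add(self, key: str, update_period: float, now: float) -> None:
        self.entries[key] = ViewEntry(self.fetch(key), now, update_period)

    def _expected_change(self, key: str) -> float:
        entry = self.entries[key]
        return entry.fetched_at + entry.update_period

    def stale_keys(self, next_query_at: float) -> List[str]:
        """Keys whose copy is expected to be outdated by the next evaluation."""
        return [k for k in self.entries
                if self._expected_change(k) <= next_query_at]

    def refresh_when_idle(self, now: float, next_query_at: float,
                          budget: int) -> List[str]:
        """Refresh at most `budget` stale entries, most overdue first."""
        due = sorted(self.stale_keys(next_query_at), key=self._expected_change)
        refreshed = due[:budget]
        for key in refreshed:
            entry = self.entries[key]
            entry.value = self.fetch(key)
            entry.fetched_at = now
        return refreshed

    def lookup(self, key: str) -> str:
        # Query time touches only the local copy, never the remote source.
        return self.entries[key].value
```

In this sketch the query path (`lookup`) never contacts the remote source; all remote access happens in `refresh_when_idle`, which a scheduler would call between query evaluations.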

Highlights

  • The Web of Data [1] enables the development of a global data environment where data providers such as individuals, companies, and governments constantly publish their rich knowledge as linked data

  • RDF Stream Processing (RSP) is based on RDF and SPARQL but extends them to express a temporal dimension of RDF data and produce a stream of answers

  • Generality of the prerequisites: the three prerequisites determine whether the proactive maintenance policy is adopted or not

Summary

Introduction

The Web of Data [1] enables the development of a global data environment where data providers such as individuals, companies, and governments constantly publish their rich knowledge as linked data. Existing RSP engines such as CQELS [6], C-SPARQL [12], SPARQLSTREAM [7], and EP-SPARQL [8] provide their own data stream models, in which RDF data carry temporal information such as a timestamp or a time interval. They support operations for handling data with temporal information and enable continuous evaluation, where query executions happen at multiple time points. The latter checks whether many updates are generated over a time period or not.
