Abstract

Vast amounts of data, especially in biomedical research, are being published as Linked Data. Being able to analyze these data sets is essential for creating new knowledge and better decision support solutions. Many of the current analytics solutions require continuous access to these data sets. However, accessing Linked Data at query time is prohibitive due to high latency in searching the content and the limited capacity of current tools to connect to these databases. To reduce this overhead cost, modern database systems maintain a cache of previously searched content. The challenge with Linked Data is that databases are constantly evolving and cached content quickly becomes outdated. To overcome this challenge, we propose a Change-Aware Maintenance Policy (CAMP) for updating cached content. We propose a Change Metric that quantifies the evolution of a Linked Dataset and determines when to update cached content. We evaluate our approach on two datasets and show that CAMP can reduce maintenance costs, improve maintenance quality and increase cache hit rates compared to standard approaches.

Highlights

  • Massive amounts of data are now publicly available, and these data are being produced at an alarming rate in all domains of the medical sciences, creating what we today call “Big Data”

  • We propose a proactive maintenance policy, the Change-Aware Maintenance Policy (CAMP), which updates the local view by issuing maintenance jobs during system idle time

  • Linked Data is another important concept in the Semantic Web, enabling machines to browse datasets on the Web of Data, such as DBpedia


Summary

Introduction

Massive amounts of data are publicly available, and these data are being produced at an alarming rate in all domains of the medical sciences, creating what we today call “Big Data”. Three kinds of conventional maintenance policy have been reported for data analytics systems [17]: immediate (update as soon as new data arrives), deferred (maintenance is postponed and not executed during the current query evaluation), and periodic (update the local views on a regular basis). These conventional policies fail to optimize local views of Linked Data effectively because of slow response times. In state-of-the-art approaches, the maintenance of local views is performed immediately whenever new data arrives.
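The three conventional policies can be contrasted with a minimal Python sketch. It is an illustration only, not the implementation from the paper or from [17]: the `LocalView` class, its method names, the 60-second period, and the reading of "deferred" as "changes wait until an explicit later maintenance step" are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Policy(Enum):
    IMMEDIATE = "immediate"
    DEFERRED = "deferred"
    PERIODIC = "periodic"


@dataclass
class LocalView:
    """Hypothetical local view (cache) over a remote Linked Data source."""
    policy: Policy
    period: float = 60.0          # assumed refresh interval in seconds (periodic only)
    last_refresh: float = 0.0
    applied: list = field(default_factory=list)   # changes visible to queries
    pending: list = field(default_factory=list)   # changes not yet applied

    def _apply_pending(self, now: float) -> None:
        # Move every queued change into the view and record the refresh time.
        self.applied.extend(self.pending)
        self.pending.clear()
        self.last_refresh = now

    def on_data_arrival(self, change: str, now: float) -> None:
        # Every policy records the change; only IMMEDIATE applies it right away.
        self.pending.append(change)
        if self.policy is Policy.IMMEDIATE:
            self._apply_pending(now)

    def on_tick(self, now: float) -> None:
        # PERIODIC refreshes once the assumed period has elapsed.
        if self.policy is Policy.PERIODIC and now - self.last_refresh >= self.period:
            self._apply_pending(now)

    def flush(self, now: float) -> None:
        # Explicit maintenance step, e.g. the postponed work of DEFERRED.
        self._apply_pending(now)


view = LocalView(Policy.PERIODIC, period=60.0)
view.on_data_arrival("triple-1", now=10.0)
view.on_tick(now=30.0)   # too early: the change stays pending
view.on_tick(now=60.0)   # period elapsed: the change is applied
```

In this sketch the immediate policy pays the maintenance cost on every arrival, while the deferred and periodic policies let the view serve stale content until the next maintenance step, which is the trade-off a change-aware policy tries to balance.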

Background
Semantic Web Technologies
Life Sciences Linked Data
Related Work
Proposed Methodology
Change Metric
Query Matching
Maintenance Manager
Scheduling Updates of Linked Data
Baseline Approaches
Performance Evaluation
Maintenance Cost
Maintenance Quality
Cache Hit Rates
Findings
Conclusions