Abstract

Massive amounts of data are currently available and being produced at an unprecedented rate in all domains of life sciences worldwide. However, this data is stored disparately, in different and unstructured formats, which makes it very hard to integrate. In this review, we examine the state of the art and propose the use of the Linked Data (LD) paradigm, a set of best practices for publishing and connecting structured data on the Web in a semantically meaningful format. We argue that utilizing LD in the life sciences will make data sets more Findable, Accessible, Interoperable, and Reusable. We identify three tiers of the research cycle in life sciences that would primarily utilize the proposed LD paradigm: (i) systematic review of the existing body of knowledge, (ii) meta-analysis of data, and (iii) knowledge discovery of novel links across different evidence streams. Finally, we demonstrate the use of LD in three use case scenarios built around the same research question and discuss the future of data/knowledge integration in life sciences and the challenges ahead.

Highlights

  • Tremendous amounts of data are currently publicly available, and more is being produced at an unprecedented rate in all domains of life sciences worldwide, promising to generate solutions to diverse problems in health and medicine

  • We examine the state of the art and then demonstrate, through mock examples, the utility of Linked Data (LD) using a case study that starts with the research question: “Can obesity be a potential cause for breast cancer in later life?” In particular, we start off with (1) analyzing the systematic review process over the existing body of knowledge from published articles, including all the mechanisms described, linking obesity to breast cancer via different mechanisms

  • Using the Semantic Web technologies and Linked Data, we demonstrate their utility in three specific use cases: (1) systematic reviews, (2) meta-analysis, and (3) knowledge discovery

Summary

Background

Tremendous amounts of data are currently publicly available, and more is being produced at an unprecedented rate in all domains of life sciences worldwide, promising to generate solutions to diverse problems in health and medicine. We propose avenues for the use of Linked Data in the life sciences by identifying three tiers of research practice that are data intensive, time consuming, and poorly supported from the data science perspective, and that would potentially benefit from a paradigm shift. These three tiers are (1) performing a systematic review of the existing body of knowledge, (2) performing a meta-analysis of the literature retrieved by the systematic review, and (3) knowledge discovery of novel links across different evidence streams and databases. We propose an LD-based approach that uses Semantic Web technologies and data sets to enable all three tiers of the research process, namely (1) systematic reviews, (2) meta-analysis, and (3) knowledge discovery.

Semantic Web Technologies
Data Sets
Proposed
Use Cases
Systematic Reviews
Meta-Analysis
Knowledge Discovery
Deployment
Conclusions