Abstract

Hypothesis generation is a critical step in research and a cornerstone in the rare disease field. Research is most efficient when those hypotheses are based on the entirety of knowledge available to date. Systematic review articles are commonly used in biomedicine to summarize existing knowledge and contextualize experimental data. But the information contained within review articles is typically expressed only as free text, which is difficult to use computationally. Researchers struggle to navigate, collect and remix prior knowledge because it is scattered across several silos without seamless integration and access. This lack of a structured information framework hinders research by both experimental and computational scientists. To better organize knowledge and data, we built a structured review article that is specifically focused on NGLY1 Deficiency, an ultra-rare genetic disease first reported in 2012. We represented this structured review as a knowledge graph and then stored this knowledge graph in a Neo4j database to simplify dissemination, querying and visualization of the network. Relative to free text, this structured review better promotes the principles of findability, accessibility, interoperability and reusability (FAIR). In collaboration with domain experts in NGLY1 Deficiency, we demonstrate how this resource can improve the efficiency and comprehensiveness of hypothesis generation. We also developed a read–write interface that allows domain experts to contribute FAIR structured knowledge to this community resource. In contrast to traditional free-text review articles, this structured review exists as a living knowledge graph that is curated by humans and accessible to computational analyses. Finally, we have generalized this workflow into modular and repurposable components that can be applied to other domain areas.
This NGLY1 Deficiency-focused network is publicly available at http://ngly1graph.org/.

Availability and implementation: Database URL: http://ngly1graph.org/. Network data files are at https://github.com/SuLab/ngly1-graph and source code is at https://github.com/SuLab/bioknowledge-reviewer.

Contact: asu@scripps.edu

Highlights

  • Science progresses via an iterative loop between hypothesis generation, experimentation and interpretation

  • We explored the use of knowledge graphs as structured review articles to identify plausible regulatory mechanisms to explain these observations

  • To identify plausible potential mechanisms to explain this observation and others like it, we iteratively constructed a knowledge graph that focused on information relevant to the NGLY1 gene, NGLY1 Deficiency and aquaporins
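
As a rough illustration of how a knowledge graph can support this kind of mechanism search, the sketch below stores a few subject-predicate-object triples and enumerates short directed paths between two nodes. Everything in it is an assumption made for illustration: the triples, the placeholder nodes PROTEIN_X and PROTEIN_Y, the predicate names and the helper `find_paths` are hypothetical and are not taken from the NGLY1 graph, its schema or its Neo4j interface.

```python
from collections import deque

# Hypothetical triples for illustration only; PROTEIN_X and PROTEIN_Y are
# placeholders, not curated facts from the NGLY1 knowledge graph.
TRIPLES = [
    ("NGLY1", "interacts_with", "PROTEIN_X"),
    ("PROTEIN_X", "regulates", "AQP1"),
    ("NGLY1", "associated_with", "NGLY1 Deficiency"),
    ("PROTEIN_Y", "regulates", "AQP1"),
]


def build_adjacency(triples):
    """Map each node to its outgoing (predicate, neighbor) pairs."""
    adjacency = {}
    for subject, predicate, obj in triples:
        adjacency.setdefault(subject, []).append((predicate, obj))
        adjacency.setdefault(obj, [])
    return adjacency


def find_paths(triples, source, target, max_hops=3):
    """Enumerate simple directed paths from source to target, breadth first.

    Each path is a list alternating nodes and predicates:
    [node, predicate, node, predicate, node, ...].
    """
    adjacency = build_adjacency(triples)
    paths = []
    queue = deque([(source, [source])])
    while queue:
        node, path = queue.popleft()
        hops = len(path) // 2  # number of edges walked so far
        if node == target and hops > 0:
            paths.append(path)
            continue
        if hops == max_hops:
            continue
        for predicate, neighbor in adjacency.get(node, []):
            if neighbor in path[::2]:  # nodes sit at even positions; skip cycles
                continue
            queue.append((neighbor, path + [predicate, neighbor]))
    return paths


if __name__ == "__main__":
    for path in find_paths(TRIPLES, "NGLY1", "AQP1"):
        print(" -> ".join(path))
```

Each returned path is a candidate chain of relationships that a domain expert could then assess as a hypothesis. In a graph database the same question would normally be posed as a path query rather than computed in application code.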

Summary

Introduction

Science progresses via an iterative loop between hypothesis generation, experimentation and interpretation. Interpretation and hypothesis generation rely on placing new data in the context of existing relevant knowledge, so researchers need access to the knowledge relevant to their research question and hypothesis. In the context of hypothesis generation, reviews are designed to collect all evidence that answers a specific question. This evidence comprises information that directly relates to the research question, background knowledge specific to the domain of the research question (such as a disease), and experimental data. These different data and knowledge are synthesized from structured, distributed knowledge bases or from unstructured scientific papers and experimental datasets. The community does not benefit from the full value of review articles for hypothesis generation
