Abstract

Background: Limited knowledge and unclear underlying biology of many rare diseases pose significant challenges to patients, clinicians, and scientists. To address these challenges, there is an urgent need to inspire and encourage scientists to propose and pursue innovative research studies that aim to uncover the genetic and molecular causes of more rare diseases and ultimately to identify effective therapeutic solutions. A clear understanding of current research efforts, knowledge and research gaps, and funding patterns, used as scientific evidence, is crucial to systematically accelerating the pace of research discovery in rare diseases, which is the overarching goal of this study.

Methods: To semantically represent NIH funding data for rare diseases and advance its use in promoting rare disease research, we identified NIH-funded projects for rare diseases by mapping GARD diseases to projects based on project titles. We then represented and managed the identified projects in a knowledge graph using Neo4j software, hosted at NCATS, based on a predefined data model that captures the semantics among the data. With this knowledge graph, we performed several case studies to demonstrate scientific evidence generation in support of rare disease research discovery.

Results: Of 5001 rare diseases belonging to 32 distinct disease categories, we identified 1294 diseases that mapped to 45,647 distinct NIH-funded projects obtained from the NIH ExPORTER by implementing semantic annotation of project titles. To capture the semantic relationships present among the mapped research funding data, we defined a data model comprising seven primary classes and corresponding object and data properties. A Neo4j knowledge graph based on this predefined data model has been developed, and we performed multiple case studies over this knowledge graph to demonstrate its use in directing and promoting rare disease research.

Conclusion: We developed an integrative knowledge graph with rare disease funding data and demonstrated its use as a source from which we can effectively identify and generate scientific evidence to support rare disease research. With the success of this preliminary study, we plan to implement advanced computational approaches for analyzing more funding-related data, e.g., project abstracts and PubMed article abstracts, and for linking to other types of biomedical data, in order to perform more sophisticated research gap analysis and identify opportunities for future research in rare diseases.
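
The title-based mapping step described in the Methods can be illustrated with a minimal sketch. This is not the authors' annotation pipeline: it swaps in plain case-insensitive whole-phrase matching for the semantic annotation the study used, and the function names, identifiers, and toy data below are all hypothetical.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase the text and reduce it to alphanumeric tokens joined by single spaces."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def map_diseases_to_projects(diseases, projects):
    """Return {disease_id: sorted project ids whose title contains the disease name}.

    Matching is whole-phrase on normalized text, a simple stand-in
    (assumption) for the semantic annotation used in the study.
    """
    hits = defaultdict(set)
    for project_id, title in projects.items():
        padded_title = f" {normalize(title)} "
        for disease_id, name in diseases.items():
            # Padding with spaces prevents matches inside longer words.
            if f" {normalize(name)} " in padded_title:
                hits[disease_id].add(project_id)
    return {disease_id: sorted(ids) for disease_id, ids in hits.items()}


# Hypothetical toy data; real inputs would be GARD disease names
# and NIH ExPORTER project titles.
diseases = {"D-A": "Cystic fibrosis", "D-B": "Huntington disease"}
projects = {
    "P1": "Airway inflammation in cystic fibrosis",
    "P2": "Mechanisms of Huntington Disease progression",
    "P3": "Deep learning for imaging",
}
mapping = map_diseases_to_projects(diseases, projects)
```

Exact phrase matching of this kind misses synonyms and abbreviations, which is one reason a semantic annotation step would be preferred in practice.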

Highlights

  • Limited knowledge and unclear underlying biology of many rare diseases pose significant challenges to patients, clinicians, and scientists

  • We focused on the “Disease or Disorder” branch (MONDO: 0000001) of the Monarch Disease Ontology (MONDO) OBO file [19], from which we extracted 32 root disease categories, including congenital abnormality, acute disease, disorder involving pain, serpinopathy, psychiatric disorder, visceral myopathy, and post-infectious disorder. (A complete list of the 32 root disease categories can be found in Additional file 1.) We mapped the 5001 Genetic and Rare Diseases (GARD) diseases to the 32 root categories by iteratively searching the MONDO disease hierarchical tree

  • 799 GARD diseases belonged to a single MONDO category, while most GARD diseases were mapped to multiple MONDO disease categories
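
The iterative search of the disease hierarchy mentioned in the highlights can be sketched as an upward traversal over parent links. This is only an illustration under stated assumptions: the `parents` dictionary stands in for is-a relations parsed from the ontology file, and the term identifiers are hypothetical placeholders rather than real MONDO IDs.

```python
def root_categories(term, parents, roots):
    """Collect every root category reachable from `term` by following parent links upward."""
    found = set()
    seen = set()
    stack = [term]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if node in roots:
            # Stop climbing once a root category is reached.
            found.add(node)
            continue
        stack.extend(parents.get(node, ()))
    return found


# Hypothetical toy hierarchy: child -> list of direct parents.
parents = {
    "disease_1": ["category_A"],
    "disease_2": ["intermediate", "category_B"],
    "intermediate": ["category_A"],
    "category_A": ["top"],
    "category_B": ["top"],
}
roots = {"category_A", "category_B"}

single = root_categories("disease_1", parents, roots)
multiple = root_categories("disease_2", parents, roots)
```

Because a term may have more than one parent, a single disease can land in several root categories, which is consistent with most GARD diseases mapping to multiple MONDO categories.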



Introduction

Zhu et al. Orphanet Journal of Rare Diseases (2021) 16:483

Limited knowledge and unclear underlying biology of many rare diseases pose significant challenges to patients, clinicians, and scientists. An estimated 25–30 million Americans are affected by one of approximately 7000 different rare diseases, most of which are poorly understood, with unclear underlying biological mechanisms. This knowledge gap leads to challenges for patients, clinicians, and investigators. Patients affected by a rare disease experience delays in diagnosis as well as a lack of available treatments; clinicians often have limited clinical knowledge and experience, which impedes their clinical decision making; and investigators struggle with limited patient data and sparse funding for research across most rare diseases [1]. To help address these challenges, we proposed a detailed analysis of research funding data to (1) enhance understanding of the current funding situation and potential funding opportunities in rare diseases, and (2) identify gaps among current research activities in rare diseases that may be primed for new research.

