Abstract

DisGeNET is a comprehensive discovery platform designed to address a variety of questions concerning the genetic underpinnings of human diseases. DisGeNET contains over 380 000 associations between >16 000 genes and 13 000 diseases, which makes it one of the largest repositories of its kind currently available. DisGeNET integrates expert-curated databases with text-mined data, covers information on Mendelian and complex diseases, and includes data from animal disease models. It features a score based on the supporting evidence to prioritize gene-disease associations. It is an open access resource available through a web interface, a Cytoscape plugin and as a Semantic Web resource. The web interface supports user-friendly data exploration and navigation. DisGeNET data can also be analysed via the DisGeNET Cytoscape plugin, and enriched with the annotations of other plugins of this popular network analysis software suite. Finally, the information contained in DisGeNET can be expanded and complemented using Semantic Web technologies and linked to a variety of resources already present in the Linked Data cloud. Hence, DisGeNET offers one of the most comprehensive collections of human gene-disease associations and a valuable set of tools for investigating the molecular mechanisms underlying diseases of genetic origin, designed to fulfill the needs of different user profiles, including bioinformaticians, biologists and health-care practitioners.

Database URL: http://www.disgenet.org/

Highlights

  • Biomedical sciences are facing an enormous increase in the data available in public sources, both in volume and in nature

  • The information contained in DisGeNET is obtained from expert-curated databases, namely CTD [7], UniProt [17], the Rat Genome Database (RGD) [18] and the Mouse Genome Database (MGD) [19]

  • It also contains data obtained from the scientific literature by different text mining approaches. These include data extracted from published peer-reviewed articles on Genome Wide Association Studies (GWAS) by the Genetic Association Database (GAD) [8], and a literature-derived human gene-disease network (LHGDN) dataset, obtained by mining Entrez Gene’s GeneRIF database [20] using conditional random fields

Summary

Introduction

Biomedical sciences are facing an enormous increase in the data available in public sources, both in volume and in nature (the so-called Biomedical Big Data). Through the integrative exploitation of genomic, phenomic and environmental information, translational bioinformatics will enable a deeper understanding of disease mechanisms [2]. The availability of both comprehensive knowledge sources on disease genes and tools that allow their analysis and exploitation should lay the basis for achieving this goal. Several resources cover different aspects of our current knowledge on the genetic basis of human diseases [4,5,6,7,8,9,10,11]. DisGeNET is one of these resources [12, 13]; its aims are to cover all disease areas (Mendelian, complex and environmental diseases), with special care taken over the integration and standardization of data, and to provide open access to knowledge on genes associated with human diseases.
