Abstract

Background: Domains are basic units of proteins, and thus exploring associations between protein domains and human inherited diseases will greatly improve our understanding of the pathogenesis of human complex diseases and further benefit the medical prevention, diagnosis and treatment of these diseases. Within a given domain-domain interaction network, we make the assumption that similarities of disease phenotypes can be explained using proximities of domains associated with such diseases. Based on this assumption, we propose a Bayesian regression approach named "domainRBF" (domain Rank with Bayes Factor) to prioritize candidate domains for human complex diseases.

Results: Using a compiled dataset containing 1,614 associations between 671 domains and 1,145 disease phenotypes, we demonstrate the effectiveness of the proposed approach through three large-scale leave-one-out cross-validation experiments (random control, simulated linkage interval, and genome-wide scan), evaluated in terms of three criteria (precision, mean rank ratio, and AUC score). We further show, through a series of permutation tests, that the proposed approach is robust to the parameters involved and to the underlying domain-domain interaction network. Having assessed the validity of this approach, we show the possibility of ab initio inference of domain-disease associations and gene-disease associations, and we illustrate the strong agreement between our inferences and the evidence from genome-wide association studies for four common diseases (type 1 diabetes, type 2 diabetes, Crohn's disease, and breast cancer). Finally, we provide a pre-calculated genome-wide landscape of associations between 5,490 protein domains and 5,080 human diseases and offer free access to this resource.

Conclusions: The proposed approach effectively ranks susceptible domains among the top of the candidates, and it is robust to the parameters involved. The ab initio inference of domain-disease associations shows strong agreement with the evidence provided by genome-wide association studies. The predicted landscape provides a comprehensive understanding of associations between domains and human diseases.
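The abstract names three evaluation criteria (precision, mean rank ratio, and AUC score) without defining them. The following is a minimal sketch of how such rank-based criteria could be computed over leave-one-out runs; it is not the authors' implementation. The definitions used here are assumptions: precision is taken as the fraction of runs in which the held-out domain ranks first, rank ratio as rank divided by the number of candidates, and AUC as the average fraction of control candidates ranked below the held-out domain. The function names and the example scores are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# One leave-one-out run: candidate scores plus the index of the held-out
# (known disease-associated) domain. Higher score means stronger candidate.
Run = Tuple[Sequence[float], int]


def rank_of_true(scores: Sequence[float], true_index: int) -> int:
    """Return the 1-based rank of the held-out candidate.

    Ties are resolved pessimistically: a control with an equal score
    is counted as ranked ahead of the held-out candidate.
    """
    true_score = scores[true_index]
    ahead = sum(
        1 for i, s in enumerate(scores) if i != true_index and s >= true_score
    )
    return 1 + ahead


def summarize(runs: List[Run]) -> Dict[str, float]:
    """Compute assumed definitions of precision, mean rank ratio, and AUC.

    Each run must contain at least two candidates (one held-out, one control).
    """
    ranks = [rank_of_true(scores, idx) for scores, idx in runs]
    sizes = [len(scores) for scores, _ in runs]
    n_runs = len(runs)

    # Fraction of runs where the held-out candidate is ranked first.
    precision = sum(1 for r in ranks if r == 1) / n_runs
    # Average of rank / number of candidates; smaller is better.
    mean_rank_ratio = sum(r / n for r, n in zip(ranks, sizes)) / n_runs
    # Per-run AUC: share of controls ranked below the held-out candidate.
    auc = sum((n - r) / (n - 1) for r, n in zip(ranks, sizes)) / n_runs

    return {
        "precision": precision,
        "mean_rank_ratio": mean_rank_ratio,
        "auc": auc,
    }


if __name__ == "__main__":
    # Hypothetical scores for two validation runs with four candidates each.
    example_runs: List[Run] = [
        ([0.9, 0.1, 0.3, 0.2], 0),  # held-out candidate ranked 1st
        ([0.2, 0.8, 0.5, 0.1], 2),  # held-out candidate ranked 2nd
    ]
    print(summarize(example_runs))
```

Under these assumed definitions, a method that always places the held-out domain first would reach a precision of 1, a mean rank ratio of 1 divided by the number of candidates, and an AUC of 1.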

Highlights

  • Domains are basic units of proteins, and exploring associations between protein domains and human inherited diseases will greatly improve our understanding of the pathogenesis of human complex diseases and further benefit the medical prevention, diagnosis and treatment of these diseases

  • A new protein domain called PAAD has been discovered to be associated with apoptosis, cancer, and autoimmune diseases [19], and a novel domain called G8 has been reported to be linked to polycystic kidney disease and non-syndromic hearing loss [20]

  • Inspired by the successes of these methods, we propose in this paper to infer associations between domains and human disease phenotypes based on the assumption that phenotypically similar diseases are caused by functionally related domains

Introduction

Domains are basic units of proteins, and exploring associations between protein domains and human inherited diseases will greatly improve our understanding of the pathogenesis of human complex diseases and further benefit the medical prevention, diagnosis and treatment of these diseases. Over the past few decades, traditional gene-mapping approaches such as family-based linkage analysis [1,2] and population-based association studies [3,4] have achieved remarkable success in pinpointing genes that are responsible for human inherited diseases [5,6]. These traditional methods are either only capable of linking diseases with genetic regions that typically contain dozens to hundreds of genes, or usually [...] domains to the disease of interest [19,20,21]. A new protein domain called PAAD has been discovered to be associated with apoptosis, cancer, and autoimmune diseases [19], and a novel domain called G8 has been reported to be linked to polycystic kidney disease and non-syndromic hearing loss [20]. Most of these discoveries have so far been made with the assistance of protein sequence analysis and other experimental techniques. It would therefore be helpful to develop computational methods to directly infer possible associations between domains and human diseases.
