Abstract

Genetic and molecular analysis of rare disease is made difficult by the small numbers of affected patients. Phenotypic comorbidity analysis can help rectify this by combining information from individuals with similar phenotypes and looking for overlap in terms of shared genes and underlying functional systems. However, few studies have combined comorbidity analysis with genomic data. We present a computational approach that connects patient phenotypes based on phenotypic co-occurrence and uses genomic information related to the patient mutations to assign genes to the phenotypes, which are used to detect enriched functional systems. These phenotypes are clustered using network analysis to obtain functionally coherent phenotype clusters. We applied the approach to the DECIPHER database, containing phenotypic and genomic information for thousands of patients with heterogeneous rare disorders and copy number variants. Validity was demonstrated through overlap with known diseases, co-mention within the biomedical literature, semantic similarity measures, and patient cluster membership. These connected pairs formed multiple phenotype clusters, showing functional coherence, and mapped to genes and systems involved in similar pathological processes. Examples include claudin genes from the 22q11 genomic region associated with a cluster of phenotypes related to DiGeorge syndrome and genes related to the GO term anterior/posterior pattern specification associated with abnormal development. The clusters generated can help with the diagnosis of rare diseases, by suggesting additional phenotypes for a given patient and potential underlying functional systems. Other tools to find causal genes based on phenotype were also investigated. The approach has been implemented as a workflow, named PhenCo, which can be adapted to any set of patients for which phenomic and genomic data are available.
Full details of the analysis, including the clusters formed, their constituent functional systems and underlying genes are given. Code to implement the workflow is available from GitHub.
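
To make the first step of the approach concrete, the following is a minimal sketch of how phenotype pairs might be connected by co-occurrence across patients. It is not the PhenCo implementation: the abstract does not state which association measure is used, so the one-sided hypergeometric test here is an assumption, as are the function names, the toy patient records, and the 0.05 cutoff. No multiple-testing correction is applied.

```python
from itertools import combinations
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n items from N, of which K are 'successes'."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def cooccurring_pairs(patient_phenotypes, alpha=0.05):
    """Return phenotype pairs seen together more often than expected by chance.

    patient_phenotypes: dict mapping a patient id to a set of phenotype labels.
    alpha: hypothetical significance cutoff (an assumption, not from the paper).
    """
    n_patients = len(patient_phenotypes)

    # Invert the records: phenotype -> set of patients presenting it.
    carriers = {}
    for patient, phenotypes in patient_phenotypes.items():
        for phenotype in phenotypes:
            carriers.setdefault(phenotype, set()).add(patient)

    pairs = {}
    for a, b in combinations(sorted(carriers), 2):
        shared = len(carriers[a] & carriers[b])
        if shared == 0:
            continue  # never observed together, nothing to test
        p_value = hypergeom_sf(shared, n_patients, len(carriers[a]), len(carriers[b]))
        if p_value <= alpha:
            pairs[(a, b)] = p_value
    return pairs


# Toy data (hypothetical): four patients share phenotypes A and B, six have only C.
patients = {f"p{i}": {"A", "B"} for i in range(1, 5)}
patients.update({f"p{i}": {"C"} for i in range(5, 11)})

pairs = cooccurring_pairs(patients)
```

In a full workflow, the resulting pairs would form the edges of a phenotype network that could then be clustered, with genes assigned to each phenotype from the genomic regions affected in the same patients.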

Highlights

  • Rare diseases are defined as those that affect fewer than 5 people per 10,000 [1]

  • By mapping genes to these phenotypes, based on data from the same patients, we were able to detect related genes and functional systems, such as genes mapping to the 22q11 genomic region underlying a cluster of phenotypes related to DiGeorge syndrome

  • For the 1,758 phenotypes presented by DECIPHER patients, we obtained a total of 36,709 Phenotype-Phenotype pairs

Introduction

Correct diagnosis based on observed clinical features is notoriously difficult for rare diseases, as they often show complex or unusual phenotype profiles and there are, by definition, few patients to compare. As a result, their diagnosis can be delayed for many years, with patients being repeatedly passed between specialists [2, 3]. It is important to develop methodology to help with their medical and molecular diagnosis. To this end, multiple personal medicine initiatives have been implemented that obtain and combine patient-related genomic data with environmental and lifestyle information for rare diseases [4]. It has been used to annotate diseases for both OMIM and Orphanet
