Abstract

Genes involved in the same function tend to have similar evolutionary histories, in that their rates of evolution covary over time. This coevolutionary signature, termed Evolutionary Rate Covariation (ERC), is calculated using only gene sequences from a set of closely related species and has demonstrated potential as a computational tool for inferring functional relationships between genes. To further define applications of ERC, we first established that roughly 55% of genetic diseases possess an ERC signature between their contributing genes. At a false discovery rate of 5%, we report 40 such diseases, including cancers, developmental disorders and mitochondrial diseases. Given these coevolutionary signatures between disease genes, we then assessed ERC's ability to prioritize known disease genes out of a list of unrelated candidates. We found that in the presence of an ERC signature, the true disease gene is effectively prioritized to the top 6% of candidates on average. We then applied this strategy to a melanoma-associated region on chromosome 1 and identified MCL1 as a potential causative gene. Furthermore, to gain global insight into disease mechanisms, we used ERC to predict molecular connections between 310 nominally distinct diseases. The resulting “disease map” network associates several diseases with related pathogenic mechanisms and unveils many novel relationships between clinically distinct diseases, such as between Hirschsprung's disease and melanoma. Taken together, these results demonstrate the utility of molecular evolution as a gene discovery platform and show that evolutionary signatures can be used to build informative gene-based networks.

Highlights

  • Advances in sequencing technologies and collaborative, large-scale -omics and genome-wide association projects are providing investigators with overwhelming lists of candidate disease gene associations

  • Evolutionary Rate Covariation (ERC) signatures are broadly elevated between genes contributing to human diseases. To determine the strength of ERC signatures between disease genes, we interrogated a set of 310 Disease Gene Groupings (DGG), each containing at least 3 genes known to be associated with an Online Mendelian Inheritance in Man (OMIM)-annotated disease

  • We examined the ERC values between each pair of constituent genes in each DGG, while testing for statistically significant elevations in ERC as a group
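
The group-level test described in the last highlight can be illustrated with a small permutation sketch. This excerpt does not specify the statistical procedure, so the approach below (comparing a group's mean pairwise ERC against random gene sets of the same size) is an assumption, and all names (`mean_pairwise_erc`, `group_permutation_pvalue`, the toy gene labels and ERC values) are hypothetical.

```python
import itertools
import random
from statistics import mean


def mean_pairwise_erc(genes, erc):
    """Mean ERC over all unordered pairs of genes in `genes`."""
    return mean(erc[frozenset(pair)] for pair in itertools.combinations(genes, 2))


def group_permutation_pvalue(group, all_genes, erc, n_perm=1000, seed=0):
    """Empirical p-value for an elevated mean ERC within `group`.

    Assumed procedure (not taken from the source): draw random gene sets of
    the same size and count how often their mean pairwise ERC is at least
    as large as the observed one.
    """
    rng = random.Random(seed)
    observed = mean_pairwise_erc(group, erc)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(all_genes, len(group))
        if mean_pairwise_erc(sample, erc) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data: 20 hypothetical genes with weak background ERC values,
# plus one 3-gene "disease group" given a strong shared signal.
toy_rng = random.Random(42)
all_genes = [f"g{i}" for i in range(20)]
erc = {
    frozenset(pair): toy_rng.uniform(-0.2, 0.2)
    for pair in itertools.combinations(all_genes, 2)
}
disease_group = ["g0", "g1", "g2"]
for pair in itertools.combinations(disease_group, 2):
    erc[frozenset(pair)] = 0.8

observed, p_value = group_permutation_pvalue(disease_group, all_genes, erc)
print(f"mean pairwise ERC = {observed:.2f}, empirical p = {p_value:.4f}")
```

With many groups tested at once, as with the 310 DGGs, the resulting p-values would still need a multiple-testing correction such as the 5% false discovery rate mentioned in the abstract.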

Introduction

Advances in sequencing technologies and collaborative, large-scale -omics and genome-wide association projects are providing investigators with overwhelming lists of candidate disease gene associations. To decipher and confirm candidate genes' roles in disease processes more effectively, computational tools have been created to both prioritize candidate genes and place them into a functional context for experimental validation. The primary methods used to create these networks rely on sophisticated algorithms that weigh certain biological features based on the query genes and, sometimes, user-dictated parameters. These parameters include Gene Ontology (GO) terms, genomic and proteomic study results (yeast two-hybrid, ChIP-seq, physical interactome datasets, protein structure comparisons, subcellular localization, tissue-specific expressivity, etc.) and even literature mining techniques such as co-occurrence in PubMed abstracts [11].
