Abstract

Whole exome sequencing has altered the way in which rare diseases are diagnosed and disease genes identified. Hundreds of novel disease-associated genes have been characterized by whole exome sequencing in the past five years, yet the identification of disease-causing mutations is often challenging because of the large number of rare variants that are being revealed. Gene prioritization aims to rank the most probable candidate genes towards the top of a list of potentially pathogenic variants. A promising new approach involves the computational comparison of the phenotypic abnormalities of the individual being investigated with those previously associated with human diseases or genetically modified model organisms. In this review, we compare and contrast the strengths and weaknesses of current phenotype-driven computational algorithms, including Phevor, Phen-Gen, eXtasy and two algorithms developed by our groups called PhenIX and Exomiser. Computational phenotype analysis can substantially improve the performance of exome analysis pipelines.

Electronic supplementary material: The online version of this article (doi:10.1186/s13073-015-0199-2) contains supplementary material, which is available to authorized users.

Highlights

  • Whole exome sequencing has altered the way in which rare diseases are diagnosed and disease genes identified

  • This project has emerged from the Canadian FORGE (Finding of Rare Disease Genes) initiative, which has been able to identify disease-causing variants for 146 of the 264 disorders studied over a 2-year period, with up to 67 novel disease-associated genes being characterized [63]

  • We present a number of recently published tools that substantially improve the analysis of whole exome sequencing (WES) data by incorporating phenotypic features into their prioritization procedures, and compare their strengths and weaknesses

Summary

Website and command line

Prioritization based on a Random Forest score from combined deleteriousness scores (CAROL, LRT, MutationTaster, PhastCons, PhyloP, PolyPhen, SIFT), haploinsufficiency, and similarity of the gene to genes annotated with the input Human Phenotype Ontology (HPO) phenotypes as measured by sequence similarity, co-expression, and involvement in the same pathway or protein–protein interactions.

Prioritization based on semantic similarity of each candidate gene to genes annotated with the input set of ontology terms taken from HPO, Mammalian Phenotype Ontology (MPO), Disease Ontology (DO), and Gene Ontology (GO).
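To make the idea of ranking genes by semantic similarity to a set of input phenotype terms more concrete, the following is a minimal sketch. It is not the implementation used by any of the tools reviewed here. It assumes a Resnik-style information-content measure combined with a best-match average, which is one common way to compare ontology term sets. The ontology, term names, gene names and annotations are all hypothetical toy data, not real HPO, MPO, DO or GO content.

```python
import math

# Hypothetical toy ontology: term -> list of parent terms (not real HPO IDs).
PARENTS = {
    "root": [],
    "skeletal": ["root"],
    "neuro": ["root"],
    "short_stature": ["skeletal"],
    "scoliosis": ["skeletal"],
    "seizures": ["neuro"],
}

# Hypothetical gene-to-phenotype annotations.
GENE_TERMS = {
    "GENE_A": {"short_stature", "scoliosis"},
    "GENE_B": {"seizures"},
    "GENE_C": {"scoliosis", "seizures"},
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = set()
    stack = [term]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(PARENTS[t])
    return seen


def information_content():
    """IC(t) = -log(fraction of genes annotated with t or a descendant)."""
    counts = {t: 0 for t in PARENTS}
    for terms in GENE_TERMS.values():
        covered = set()
        for t in terms:
            covered |= ancestors(t)
        for t in covered:
            counts[t] += 1
    n = len(GENE_TERMS)
    # Simplification: terms with no annotations get IC 0 instead of infinity.
    return {t: -math.log(c / n) if c else 0.0 for t, c in counts.items()}


def resnik(a, b, ic):
    """IC of the most informative ancestor shared by terms a and b."""
    common = ancestors(a) & ancestors(b)
    return max(ic[t] for t in common)


def gene_score(query_terms, gene_terms, ic):
    """Average, over the query terms, of the best match among the gene's terms."""
    best = [max(resnik(q, g, ic) for g in gene_terms) for q in query_terms]
    return sum(best) / len(best)


def rank_genes(query_terms):
    """Return (gene, score) pairs sorted from most to least similar."""
    ic = information_content()
    scores = {g: gene_score(query_terms, ts, ic) for g, ts in GENE_TERMS.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for gene, score in rank_genes(["short_stature"]):
        print(f"{gene}\t{score:.3f}")
```

With the toy data above, a patient annotated only with `short_stature` ranks `GENE_A` first, because it shares that rare term exactly. `GENE_C` comes next because it shares only the more general `skeletal` ancestor, and `GENE_B` comes last because its only common ancestor is the root. A real pipeline would combine such a phenotype score with variant-level evidence such as deleteriousness and allele frequency.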

All family VCF
All coding
Annovar filtered for Phevor analysis
