Abstract
Next-generation sequencing has revolutionized rare disease diagnostics, but many patients remain without a molecular diagnosis, largely because many candidate variants typically survive even strict filtering. Exomiser was launched in 2014 as a Java tool that performs an integrative analysis of patients’ sequencing data and their phenotypes encoded with Human Phenotype Ontology (HPO) terms. It prioritizes variants by leveraging information on variant frequency, predicted pathogenicity, and gene-phenotype associations derived from human diseases, model organisms, and protein–protein interactions. Early published releases of Exomiser were able to prioritize disease-causative variants as top candidates in up to 97% of simulated whole-exomes. However, the real patient datasets tested and published so far have been very limited in size. Here, we present the latest Exomiser version, 12.0.1, which includes many new features. We assessed its performance using a set of 134 whole-exomes from patients with a range of rare retinal diseases and a known molecular diagnosis. Using default settings, Exomiser ranked the correctly diagnosed variants as the top candidate in 74% of the dataset and within the top 5 in 94%; omitting the patients’ HPO profiles (i.e., variant-only analysis) decreased the performance to 3% and 27%, respectively. In conclusion, Exomiser is an effective support tool for phenotype-driven variant prioritization in rare Mendelian disease.
Highlights
Despite the tremendous advances brought by next-generation sequencing to the field of rare Mendelian gene discovery and diagnostics, many challenges remain [1,2], and this is reflected in limited diagnostic yield.
We evaluated the performance of the Exomiser software using a real patient whole-exome dataset with known molecular diagnoses, obtained from a number of next-generation sequencing studies conducted at Moorfields Eye Hospital and the University College London (UCL) Institute of Ophthalmology (London, UK) over approximately four years (2011–2015) [62].
Patient P85 had been solved with a homozygous frameshift truncation variant (Glu211Aspfs*13) in the LCA5 gene; this homozygous variant had been detected only by manual inspection of the sequencing data in the Integrative Genomics Viewer (IGV).
Summary
Despite the tremendous advances brought by next-generation sequencing to the field of rare Mendelian gene discovery and diagnostics, many challenges remain [1,2], and this is reflected in limited diagnostic yield. Several limitations of high-throughput sequencing still exist that may impact the diagnostic yield. These include the still incomplete coverage, which especially affects short-read sequencing, the lack of a de facto standard calling algorithm for copy-number variants (CNVs), and the challenges in filtering and interpreting short tandem repeats with no known disease association [4]. Another major reason why a disorder remains unsolved after next-generation sequencing is the complexity of interpreting the wealth of variants found, which is further hindered by incomplete knowledge of gene function [4]. More complex statistical frameworks that account for disease prevalence, genetic and allelic heterogeneity, inheritance mode, penetrance, and sampling variance in reference datasets have recently been suggested for more effective frequency-based variant filtering [5,6].
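To make the idea of frequency-based filtering concrete, the following is a minimal sketch of how a maximum credible allele frequency could be derived for a dominant disorder. It is an illustrative simplification, not necessarily the frameworks cited in [5,6] and not Exomiser's own filter: it assumes that prevalence is roughly twice the total causal allele frequency times the penetrance, and it ignores sampling variance in the reference dataset. The function names, parameter names, and all numeric values are hypothetical.

```python
def max_credible_allele_frequency(
    prevalence,
    penetrance,
    max_genetic_contribution=1.0,
    max_allelic_contribution=1.0,
):
    """Upper bound on the population frequency of one causal variant.

    Assumes a dominant disorder, where prevalence ~= 2 * q * penetrance
    and q is the combined frequency of all causal alleles. A single
    variant can account for at most the fraction
    max_genetic_contribution * max_allelic_contribution of q.
    Sampling variance in the reference dataset is NOT modelled here.
    """
    inputs = {
        "prevalence": prevalence,
        "penetrance": penetrance,
        "max_genetic_contribution": max_genetic_contribution,
        "max_allelic_contribution": max_allelic_contribution,
    }
    for name, value in inputs.items():
        if not 0 < value <= 1:
            raise ValueError(f"{name} must be in (0, 1], got {value}")
    total_causal_af = prevalence / (2 * penetrance)
    return total_causal_af * max_genetic_contribution * max_allelic_contribution


def passes_frequency_filter(observed_af, threshold):
    """Keep a variant only if it is no more common than the threshold."""
    return observed_af <= threshold


# Hypothetical inputs: prevalence 1 in 4,000, 50% penetrance, one gene
# explaining at most half of cases, one variant at most 10% of that gene.
threshold = max_credible_allele_frequency(
    prevalence=1 / 4000,
    penetrance=0.5,
    max_genetic_contribution=0.5,
    max_allelic_contribution=0.1,
)
print(f"maximum credible allele frequency: {threshold:.2e}")
print(passes_frequency_filter(1e-6, threshold))
print(passes_frequency_filter(1e-3, threshold))
```

Under these assumed inputs, a variant seen at a frequency of 1 in 1,000 in a reference population would be filtered out, whereas one seen at 1 in 1,000,000 would be retained. A recessive model or a correction for reference dataset size would need a different calculation.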