Abstract

Discriminating the causative disease variant(s) for individuals with inherited or de novo mutations presents one of the main challenges faced by the clinical genetics community today. Computational approaches for variant prioritization include machine learning methods utilizing a large number of features, including molecular information, interaction networks, or phenotypes. Here, we demonstrate the PhenomeNET Variant Predictor (PVP) system that exploits semantic technologies and automated reasoning over genotype-phenotype relations to filter and prioritize variants in whole exome and whole genome sequencing datasets. We demonstrate the performance of PVP in identifying causative variants on a large number of synthetic whole exome and whole genome sequences, covering a wide range of diseases and syndromes. In a retrospective study, we further illustrate the application of PVP for the interpretation of whole exome sequencing data in patients suffering from congenital hypothyroidism. We find that PVP accurately identifies causative variants in whole exome and whole genome sequencing datasets and provides a powerful resource for the discovery of causal variants.

Highlights

  • Since the first successful identification of disease-causing variation from whole exome sequencing in 2010 [1], impressive advances have been made in the field of next-generation sequencing and its related analysis, with the aim of fulfilling the promise of whole exome (WES) and whole genome (WGS) sequencing for personalized medicine

  • In PhenomeNET Variant Predictor (PVP), we combine methods to determine whether a variant is pathogenic with information about the phenotypes in which a gene is known to be involved to identify candidate causative variants in WES and WGS data

  • PhenomeNET consists of a repository of gene-phenotype associations from human and model organisms, an ontology that integrates phenotypes across species, and a semantic similarity measure that determines the similarity between two sets of phenotypes
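
The idea of comparing two sets of phenotypes and combining the result with a pathogenicity estimate can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy ontology, the term and gene names, the Jaccard similarity over ancestor closures, and the product used to combine scores are hypothetical stand-ins, not the ontology, similarity measure, or scoring model actually used by PhenomeNET or PVP.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical toy ontology: each term maps to its direct parent terms.
TOY_ONTOLOGY: Dict[str, Set[str]] = {
    "abnormality": set(),
    "endocrine_abnormality": {"abnormality"},
    "thyroid_abnormality": {"endocrine_abnormality"},
    "hypothyroidism": {"thyroid_abnormality"},
    "goiter": {"thyroid_abnormality"},
    "skeletal_abnormality": {"abnormality"},
    "short_stature": {"skeletal_abnormality"},
}


def ancestors(term: str, ontology: Dict[str, Set[str]]) -> Set[str]:
    """Return the term together with all of its ancestors."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(ontology.get(current, ()))
    return seen


def closure(terms: Iterable[str], ontology: Dict[str, Set[str]]) -> Set[str]:
    """Union of the ancestor sets of every term in a phenotype set."""
    result: Set[str] = set()
    for term in terms:
        result |= ancestors(term, ontology)
    return result


def phenotype_similarity(a: Iterable[str], b: Iterable[str],
                         ontology: Dict[str, Set[str]]) -> float:
    """Jaccard index of the two ancestor closures (illustrative choice)."""
    ca, cb = closure(a, ontology), closure(b, ontology)
    union = ca | cb
    return len(ca & cb) / len(union) if union else 0.0


def rank_variants(variants: List[Tuple[str, str, float]],
                  patient_phenotypes: Set[str],
                  gene_phenotypes: Dict[str, Set[str]],
                  ontology: Dict[str, Set[str]]) -> List[Tuple[str, float]]:
    """Rank (variant_id, gene, pathogenicity) tuples, best first.

    The combined score is pathogenicity * phenotype similarity, an
    assumption chosen for simplicity rather than a published model.
    """
    scored = []
    for variant_id, gene, pathogenicity in variants:
        sim = phenotype_similarity(
            patient_phenotypes, gene_phenotypes.get(gene, set()), ontology)
        scored.append((variant_id, pathogenicity * sim))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical inputs: gene names, phenotypes, and scores are made up.
patient = {"hypothyroidism"}
gene_phenotypes = {"GENE_A": {"goiter"}, "GENE_B": {"short_stature"}}
variants = [("var1", "GENE_B", 0.9), ("var2", "GENE_A", 0.6)]
ranked = rank_variants(variants, patient, gene_phenotypes, TOY_ONTOLOGY)
```

In this toy setup, the variant in the gene whose known phenotypes resemble the patient's ranks above a variant with a higher pathogenicity score in a phenotypically unrelated gene, which is the intuition behind phenotype-aware prioritization.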


Summary

Introduction

Since the first successful identification of disease-causing variation from whole exome sequencing in 2010 [1], impressive advances have been made in the field of next-generation sequencing and its related analysis, with the aim of fulfilling the promise of whole exome (WES) and whole genome (WGS) sequencing for personalized medicine. Such approaches have revolutionized our ability to identify the genetic underpinnings of disease as well as improve our capacity to stratify patient populations and diagnose them in a more accurate and timely manner [2]. In haploid insufficiency, a heterozygote with a loss-of-function allele may develop an abnormal phenotype [12]. Given these phenomena, it is clear why finding the “needle in a stack of needles” [13] remains one of the key challenges in fully utilizing WES and WGS data for personalized medicine.

