Abstract

Variants of unknown/uncertain significance (VUS) pose a major dilemma in current genetic variation screening methods and genetic counselling. Driven by next-generation sequencing (NGS) methods such as whole-exome sequencing (WES), a plethora of VUS are being detected in research laboratories as well as in the health sector. Motivated by this overabundance of VUS, we propose a novel computational methodology, termed VariantClassifier (VarClass), which utilizes gene-association networks and polygenic risk prediction models to shed light on this grey area of genetic variation in association with disease. VarClass has been evaluated through numerous validation steps and proved successful in assigning significance to VUS in association with specific diseases of interest. Notably, using VUS that are deemed significant by VarClass, we improved risk prediction accuracy in four large case studies involving disease-control cohorts from GWAS as well as WES, when compared to traditional odds ratio analysis. Biological interpretation of selected high-scoring VUS revealed biological themes relevant to the diseases under investigation. VarClass is available as a standalone tool for large-scale data analyses, as well as a web server with additional functionalities through a user-friendly graphical interface.

Highlights

  • Human genetic variation analysis has recently been influenced by next-generation sequencing (NGS) technologies in the form of whole-genome sequencing (WGS), whole-exome sequencing (WES) and multigene panels

  • For the first 2 steps in the pipeline, ClinVar[21] is used to extract known variants and genes associated with a disease direction of interest

  • We have developed, validated and applied VarClass, a novel computational framework suited for performing downstream analysis on genetic variation data derived from high-throughput methodologies, in order to provide a disease-related ranking score for variants of unknown/uncertain significance (VUS)


Summary

Introduction

Human genetic variation analysis has recently been influenced by next-generation sequencing (NGS) technologies in the form of whole-genome sequencing (WGS), whole-exome sequencing (WES) and multigene panels. Human genetic variation screening is currently focused on detecting a pathogenic variant in targeted high-risk individuals with an increased likelihood for a specific inheritable disease, as suggested by their family history. This approach is mainly suited to hereditary diseases such as inherited forms of breast and ovarian cancers, or certain types of “monogenic” disorders such as certain types of ataxias. Accepted methods for variant analysis are based on odds ratio analysis, such as that performed in traditional GWAS studies, in order to obtain associations between specific, “single” variants and a disease of interest. One drawback of this methodology is that it relies on strict thresholds, so variants that fall below the odds ratio thresholds are not taken into account. Pathogenicity-related information on the impact of individual variants in the context of disease is available from databases such as ClinVar[21] and COSMIC[22], and from scoring schemes such as CADD[23], PolyPhen[24] and SIFT[25], which score the impact of the amino acid change caused by the genetic variation at the protein level.
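To make the threshold drawback concrete, the following is a minimal sketch of a single-variant odds ratio calculation from a 2x2 carrier table, followed by a hard cutoff. It is not the VarClass method. The function names, the variant labels, the counts and the cutoff of 1.5 are all hypothetical, and the confidence interval uses the standard log (Woolf) approximation with a 0.5 continuity correction when a cell is zero.

```python
import math


def odds_ratio(case_carriers, case_noncarriers, control_carriers, control_noncarriers):
    """Return (odds ratio, 95% CI lower, 95% CI upper) for one variant.

    Uses the log (Woolf) approximation for the interval. If any cell is
    zero, 0.5 is added to every cell (Haldane-Anscombe correction).
    """
    cells = [case_carriers, case_noncarriers, control_carriers, control_noncarriers]
    if any(count == 0 for count in cells):
        cells = [count + 0.5 for count in cells]
    a, b, c, d = cells
    estimate = (a * d) / (b * c)
    std_err = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - 1.96 * std_err)
    upper = math.exp(math.log(estimate) + 1.96 * std_err)
    return estimate, lower, upper


def passes_threshold(estimate, threshold):
    """Hard cutoff: a variant is kept only if its odds ratio reaches the threshold."""
    return estimate >= threshold


if __name__ == "__main__":
    # Hypothetical counts: (case carriers, case non-carriers,
    #                       control carriers, control non-carriers)
    variants = {
        "variant_A": (30, 70, 10, 90),
        "variant_B": (12, 88, 10, 90),
    }
    cutoff = 1.5  # illustrative value only, not taken from the paper
    for name, counts in variants.items():
        estimate, lower, upper = odds_ratio(*counts)
        status = "kept" if passes_threshold(estimate, cutoff) else "dropped"
        print(f"{name}: OR={estimate:.2f} (95% CI {lower:.2f}-{upper:.2f}) -> {status}")
```

With these made-up counts, variant_B has an odds ratio above 1 but below the cutoff, so a strict single-variant filter discards it even though it may still carry some disease-related signal.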

