Insight into neutral and disease-associated human genetic variants through interpretable predictors.

Bastiaan A Van Den Berg, Tjaart A P De Beer, Marcel J T Reinders, Dick De Ridder

doi:10.1371/journal.pone.0120729

Abstract

A variety of methods that predict whether human nonsynonymous single nucleotide polymorphisms (SNPs) are neutral or disease-associated have been developed over the last decade. These methods are used to pinpoint disease-associated variants among the many variants obtained with next-generation sequencing technologies. The high performance of current sequence-based predictors indicates that sequence data contain valuable information about whether a variant is neutral or disease-associated. However, most predictors do not readily disclose this information, so it remains unclear which sequence properties are most important. Here, we show how interpreting predictors can provide insight into sequence characteristics of variants and their surroundings. We used an extensive range of features derived from the variant itself, its surrounding sequence, sequence conservation, and sequence annotation, and employed linear support vector machine classifiers so that feature importance can be extracted from the trained predictors. Our approach provides additional information about which features are most important for the predictions made. Furthermore, for large sets of known variants, it can provide insight into the mechanisms responsible for variants being disease-associated.
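The idea of extracting feature importance from a linear support vector machine can be illustrated with a minimal sketch. It is not the authors' implementation: the toy data, the feature names, the hyperparameters, and the simple subgradient-descent trainer are all assumptions made for illustration, and ranking features by absolute weight is one common way to interpret a linear model rather than a procedure confirmed by the text above.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=500):
    """Full-batch subgradient descent on the L2-regularised hinge loss.

    X is a list of feature vectors, y a list of labels in {+1, -1}.
    Returns the weight vector w and the bias b of the linear decision
    function f(x) = w . x + b.
    """
    n = len(X)
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        # Gradient of the regulariser (lam / 2) * ||w||^2.
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # point lies inside the margin or is misclassified
                for j, xj in enumerate(xi):
                    grad_w[j] -= yi * xj / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Hypothetical toy data: [centred conservation score, uninformative feature].
# +1 = disease-associated, -1 = neutral. Values are invented for illustration.
feature_names = ["conservation", "uninformative"]
X = [
    [0.4, 0.5], [0.4, -0.5], [0.2, 0.5], [0.2, -0.5],      # disease-associated
    [-0.2, 0.5], [-0.2, -0.5], [-0.4, 0.5], [-0.4, -0.5],  # neutral
]
y = [1, 1, 1, 1, -1, -1, -1, -1]

w, b = train_linear_svm(X, y)

# Rank features by the absolute value of their learned weight.
ranked = sorted(zip(feature_names, w), key=lambda p: abs(p[1]), reverse=True)
for name, weight in ranked:
    print(f"{name}: {weight:+.3f}")
```

Because the model is linear, each weight states directly how strongly, and in which direction, a feature pushes a variant toward one class. Weights are only comparable when the features are on similar scales, so real feature sets would typically be standardised first.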

Highlights

Over the last decade, many predictors have been developed to categorize human nonsynonymous SNPs as disease-associated or neutral [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]
Such methods typically employ large sets of known neutral and disease-associated variants to learn how to separate the two classes based on variant characteristics, i.e., features
The degree of sequence conservation is highly predictive for disease association of genetic variants

Summary

Introduction

Many predictors have been developed to categorize human nonsynonymous SNPs as disease-associated or neutral [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]. Such predictors can be used to identify the relatively few disease-associated variants in human variation data, which are accumulating rapidly owing to advances in whole-genome sequencing techniques [17]. Several methods, including the widely used SIFT, predict class labels by thresholding a single conservation-based feature
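Thresholding a single conservation-based feature can be sketched in a few lines. The scores and labels below are hypothetical, and the cutoff of 0.05 with lower scores flagged as disease-associated is an assumption borrowed from the commonly cited SIFT default, not a value stated in this text.

```python
def predict_by_threshold(scores, cutoff=0.05):
    """Label a variant disease-associated when its score falls below the cutoff.

    Assumes a score for which low values mean the substitution is poorly
    tolerated at a conserved position.
    """
    return ["disease-associated" if s < cutoff else "neutral" for s in scores]


# Hypothetical conservation-based scores and known labels for six variants.
scores = [0.00, 0.02, 0.04, 0.30, 0.75, 0.03]
labels = [
    "disease-associated", "disease-associated", "disease-associated",
    "neutral", "neutral", "neutral",
]

predictions = predict_by_threshold(scores)
accuracy = sum(p == t for p, t in zip(predictions, labels)) / len(labels)
print(predictions)
print(f"accuracy: {accuracy:.2f}")
```

A single-feature threshold of this kind is trivially interpretable, but it cannot combine conservation with other variant characteristics the way a multi-feature classifier can.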

Journal: PLoS ONE
Publication Date: Mar 31, 2015
Citations: 17
License type: CC BY 4.0
