Abstract

OBJECTIVES: Single nucleotide variants (SNVs) are the most common type of genetic variation among humans. High-throughput sequencing methods have recently characterized millions of SNVs in several thousand individuals from various populations, most of which are benign polymorphisms. Identifying rare disease-causing SNVs remains challenging and often requires functional in vitro studies. Prioritizing the most likely pathogenic SNVs is therefore important, and several computational methods have been developed for this purpose. However, these methods are based on different assumptions and often produce discordant results. The aim of the present study was to evaluate the performance of 11 widely used, freely available pathogenicity prediction tools in identifying known pathogenic SNVs: FATHMM, Mutation Assessor, Protein Analysis Through Evolutionary Relationships (PANTHER), Sorting Intolerant From Tolerant (SIFT), MutationTaster, Polymorphism Phenotyping v2 (PolyPhen-2), Align Grantham Variation Grantham Deviation (Align-GVGD), CADD, PROVEAN, SNPs&GO, and MutPred.

METHODS: We analyzed 40 functionally proven pathogenic SNVs in four different genes associated with differences in sex development (DSD): 17β-hydroxysteroid dehydrogenase 3 (HSD17B3), steroidogenic factor 1 (NR5A1), androgen receptor (AR), and luteinizing hormone/chorionic gonadotropin receptor (LHCGR). To evaluate the false discovery rate of each tool, we analyzed 36 frequent (MAF>0.01) benign SNVs found in the same four DSD genes. The quality of the predictions was analyzed using six parameters: accuracy, precision, negative predictive value (NPV), sensitivity, specificity, and Matthews correlation coefficient (MCC). Overall performance was assessed using a receiver operating characteristic (ROC) curve.

RESULTS: None of the tools was 100% precise in identifying pathogenic SNVs. The highest specificity, precision, and accuracy were observed for Mutation Assessor, MutPred, and SNPs&GO. These three tools also performed best in the ROC curve analysis. Of the 11 tools evaluated, 6 (Mutation Assessor, PANTHER, SIFT, MutationTaster, PolyPhen-2, and CADD) exhibited sensitivity >0.90 but lower specificity (0.42-0.67). Performance, based on MCC, ranged from poor (FATHMM=0.04) to reasonably good (MutPred=0.66).

CONCLUSION: Computational algorithms are important tools for SNV analysis, but their correlation with functional studies is not consistent. In the present analysis, the best performing tools (based on accuracy, precision, and specificity) were Mutation Assessor, MutPred, and SNPs&GO, which presented the best concordance with functional studies.
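The six quality parameters named in the Methods all derive from a tool's confusion matrix, where a "positive" is an SNV predicted to be pathogenic. The sketch below uses the standard textbook definitions and is not taken from the study's own analysis. The function name `confusion_metrics` and the example counts are hypothetical, chosen only so that the totals match the 40 pathogenic and 36 benign SNVs described above.

```python
from math import sqrt


def confusion_metrics(tp, fp, tn, fn):
    """Compute six standard classification metrics from confusion-matrix counts.

    tp: pathogenic SNVs predicted pathogenic
    fn: pathogenic SNVs predicted benign
    tn: benign SNVs predicted benign
    fp: benign SNVs predicted pathogenic
    """

    def ratio(num, den):
        # Return NaN instead of raising when a denominator is zero.
        return num / den if den else float("nan")

    mcc_den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical counts for an imaginary tool (NOT results from the study):
# 36 of 40 pathogenic SNVs flagged, 24 of 36 benign SNVs cleared.
example = confusion_metrics(tp=36, fp=12, tn=24, fn=4)

for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

With these made-up counts, sensitivity is high while specificity is moderate, the same qualitative pattern the Results report for several tools. MCC condenses all four counts into a single value between -1 and 1, which is why it can rank tools differently than accuracy alone.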

Highlights

  • The term “differences in sex development” (DSD) refers to congenital conditions in which chromosomal, gonadal, or anatomical sex development is atypical (1)

  • The aim of the present study was to evaluate the performance of 11 widely used, freely available pathogenicity prediction tools in identifying known pathogenic single nucleotide variants (SNVs): FATHMM, Mutation Assessor, Protein Analysis Through Evolutionary Relationships (PANTHER), Sorting Intolerant From Tolerant (SIFT), MutationTaster, Polymorphism Phenotyping v2 (PolyPhen-2), Align Grantham Variation Grantham Deviation (Align-GVGD), CADD, PROVEAN, SNPs&GO, and MutPred

  • We analyzed 40 disease-causing SNVs in four different genes associated with DSD: 17β-hydroxysteroid dehydrogenase 3 (HSD17B3), steroidogenic factor 1 (NR5A1), androgen receptor (AR), and luteinizing hormone/chorionic gonadotropin receptor (LHCGR)


Summary

Introduction

The term “differences in sex development” (DSD) refers to congenital conditions in which chromosomal, gonadal, or anatomical sex development is atypical (1). DSDs can be classified into three major categories: sex chromosome DSDs, 46,XX DSDs, and 46,XY DSDs (2). Most causes of DSDs are genetically determined, and several genes have been found … Functional studies of disease-associated variants are often used, but they are laborious and time-consuming (4,5).
