Critical assessment of missense variant effect predictors on disease-relevant variant data.

Ruchir Rastogi, Ryan Chung, Constantina Bakolitsa, Daniele Raimondi, Tobias Olenyi, Olivier Poch, Akash Kamandula, Fabrizio Pucci, Marianne Rooman, Thomas Weber, Kirsley Chennen, Gaia Andreoletti, Chang Li, Matthew Mort, Steven E Brenner, Predrag Radivojac, Burkhard Rost, Wim Vranken, Celine Marquet, Nilah M Ioannidis, Xiaoming Liu, Vikas Pejaver, Giulia Babbi, Dong-Wook Kim, Yisu Peng, Gabriel Cia, Junwoo Woo, Pier Luigi Martelli, Kyoungyeul Lee, Francois Ancien, Timothy Bergquist, David N Cooper, Changwon Keum, Rita Casadio, Castrense Savojardo, Sindy Li

doi:10.1101/2024.06.06.597828

Abstract

Regular, systematic, and independent assessment of computational tools used to predict the pathogenicity of missense variants is necessary to evaluate their clinical and research utility and suggest directions for future improvement. Here, as part of the sixth edition of the Critical Assessment of Genome Interpretation (CAGI) challenge, we assess missense variant effect predictors (or variant impact predictors) on an evaluation dataset of rare missense variants from disease-relevant databases. Our assessment evaluates predictors submitted to the CAGI6 Annotate-All-Missense challenge, predictors commonly used by the clinical genetics community, and recently developed deep learning methods for variant effect prediction. To explore a variety of settings that are relevant for different clinical and research applications, we assess performance within different subsets of the evaluation data and within high-specificity and high-sensitivity regimes. We find strong performance of many predictors across multiple settings. Meta-predictors tend to outperform their constituent individual predictors; however, several individual predictors have performance similar to that of commonly used meta-predictors. The relative performance of predictors differs in high-specificity and high-sensitivity regimes, suggesting that different methods may be best suited to different use cases. We also characterize two potential sources of bias. Predictors that incorporate allele frequency as a predictive feature tend to have reduced performance when distinguishing pathogenic variants from very rare benign variants, and predictors supervised on pathogenicity labels from curated variant databases often learn label imbalances within genes. 
Overall, we find notable advances over the oldest and most cited missense variant effect predictors and continued improvements among the most recently developed tools, and the CAGI Annotate-All-Missense challenge (also termed the Missense Marathon) will continue to assess state-of-the-art methods as the field progresses. Together, our results help illuminate the current clinical and research utility of missense variant effect predictors and identify potential areas for future development.
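
To make the high-specificity and high-sensitivity regimes mentioned in the abstract concrete, the following is a minimal Python sketch of two regime-specific metrics: the best sensitivity a predictor reaches while holding specificity at or above a floor, and the best specificity it reaches while holding sensitivity at or above a floor. This is not the assessment code used in the paper. The function names, the scores, and the labels are hypothetical, and the sketch assumes that a higher score means "more likely pathogenic" and that a variant is called pathogenic when its score meets or exceeds the threshold.

```python
from typing import List, Sequence, Tuple


def _operating_points(scores: Sequence[float],
                      labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (sensitivity, specificity) at every distinct score threshold.

    labels: 1 = pathogenic, 0 = benign (assumed encoding).
    A variant is called pathogenic when score >= threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one pathogenic and one benign variant")
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= t)
        tn = n_neg - fp
        points.append((tp / n_pos, tn / n_neg))
    return points


def sensitivity_at_specificity(scores, labels, min_specificity: float) -> float:
    """High-specificity regime: best sensitivity with specificity >= floor."""
    ok = [sens for sens, spec in _operating_points(scores, labels)
          if spec >= min_specificity]
    return max(ok, default=0.0)


def specificity_at_sensitivity(scores, labels, min_sensitivity: float) -> float:
    """High-sensitivity regime: best specificity with sensitivity >= floor."""
    ok = [spec for sens, spec in _operating_points(scores, labels)
          if sens >= min_sensitivity]
    return max(ok, default=0.0)


# Toy data, invented for illustration only (not from the paper).
toy_scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
toy_labels = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]

print(sensitivity_at_specificity(toy_scores, toy_labels, 0.8))
print(specificity_at_sensitivity(toy_scores, toy_labels, 1.0))
```

Two predictors with similar overall ranking quality can differ on these two numbers, which is one way the relative ordering of methods could change between regimes. In practice, a library routine for ROC curves would replace the quadratic loop used here for readability.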

Journal: bioRxiv : the preprint server for biology
Publication Date: Jun 8, 2024
Citations: 3
