Refinement of the clinical variant interpretation framework by statistical evidence and machine learning

Atsushi Takata, Kohei Hamanaka, Naomichi Matsumoto

doi:10.1016/j.medj.2021.02.003

Open Access

https://doi.org/10.1016/j.medj.2021.02.003

Journal: Med
Publication Date: Mar 11, 2021
Citations: 2
License type: publisher-specific-oa

Affiliation: Yokohama City University

Abstract

Although the American College of Medical Genetics and Genomics/Association for Molecular Pathology (ACMG/AMP) guidelines for variant interpretation are widely used in clinical genetics, these knowledge-based guidelines leave room for improvement. We statistically assessed the average deleteriousness of start-lost, stop-lost, and in-frame insertion and deletion (indel) variants and extracted deleterious subsets, informed by the proportions of rare variants in the general population of the Genome Aggregation Database (gnomAD). A machine learning-based model scoring the pathogenicity of start-lost variants (the PoStaL model) was constructed by predicting possible translation initiation sites on transcripts by deep learning and training a random forest on known pathogenic and likely benign variants. The proportion of rare variants was highest in stop-lost variants, followed by in-frame indels and start-lost variants, suggesting that the criteria in the ACMG/AMP guidelines assigning PVS (pathogenic very strong) to start-lost variants and PM (pathogenic moderate) to stop-lost and in-frame indel variants would not be appropriate. Regarding deleterious subsets, stop-lost variants introducing extensions of more than 30 amino acids and in-frame indels computationally predicted to be damaging are enriched for rare and known pathogenic variants. For start-lost variants, we developed the PoStaL model, which outperforms existing tools. We also provide comprehensive lists of the PoStaL scores for start-lost variants and of the lengths of the amino acid extensions introduced by stop-lost variants. Our study could contribute to refinement of the ACMG/AMP guidelines, provides resources for future investigation, and offers an example of how to improve knowledge-based frameworks with data-driven approaches. The study was supported by grants from the Japan Agency for Medical Research and Development (AMED) and the Japan Society for the Promotion of Science (JSPS).
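As a rough illustration of the "proportion of rare variants" measure that the abstract uses as a proxy for average deleteriousness, the following minimal sketch computes that proportion per variant class. The records, the class labels, the function name, and the allele frequency cutoff are all hypothetical and chosen for illustration; the abstract does not state the threshold or the data format used in the study.

```python
from collections import defaultdict

# Hypothetical toy records: (variant_class, allele_frequency).
# These are made-up values, not gnomAD data.
TOY_VARIANTS = [
    ("start_lost", 0.00001),
    ("start_lost", 0.02),
    ("stop_lost", 0.000004),
    ("stop_lost", 0.00002),
    ("inframe_indel", 0.00003),
    ("inframe_indel", 0.15),
    ("inframe_indel", 0.000008),
]

# Assumed cutoff for calling a variant "rare"; illustrative only.
RARE_AF_THRESHOLD = 0.0001


def rare_proportion_by_class(variants, threshold=RARE_AF_THRESHOLD):
    """Return {variant_class: fraction of variants with AF below threshold}."""
    total = defaultdict(int)
    rare = defaultdict(int)
    for variant_class, allele_freq in variants:
        total[variant_class] += 1
        if allele_freq < threshold:
            rare[variant_class] += 1
    return {cls: rare[cls] / total[cls] for cls in total}


proportions = rare_proportion_by_class(TOY_VARIANTS)

# Print classes from highest to lowest proportion of rare variants.
for cls, frac in sorted(proportions.items(), key=lambda kv: -kv[1]):
    print(f"{cls}: {frac:.2f}")
```

The underlying idea is that variant classes under stronger purifying selection tend to be kept at low frequency in the general population, so a higher proportion of rare variants suggests greater average deleteriousness. A real analysis would need to account for confounders such as mutation rate and sample size, which this sketch ignores.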