Abstract
Introduction
Primary immunodeficiencies (PIDs) are a group of rare genetic disorders characterized by dysfunction of immune system components. Early diagnosis and treatment are essential to prevent severe or life-threatening complications. PIDs manifest with diverse clinical symptoms, which makes accurate diagnosis challenging. A key aspect of PID diagnosis is identifying specific amino acid substitutions in proteins associated with heritable diseases. In this study, we developed classification sequence-structure-property relationship (SSPR) models for predicting the pathogenicity of amino acid substitutions (AASs) in 25 proteins associated with the most important and genetically well-studied PIDs. These proteins are encoded by the genes IL2RG, JAK3, RAG1, RAG2, ADA, DCLRE1C, CD40LG, WAS, ATM, STAT3, KMT2D, BTK, FOXP3, AIRE, FAS, ELANE, ITGB2, CYBB, G6PD, GATA2, STAT1, IFIH1, NLRP3, MEFV, and SERPING1.
Methods
Data on 4825 pathogenic and benign AASs in the selected proteins were extracted from ClinVar and gnomAD. SSPR models were created for each protein using the MultiPASS software, which is based on a Bayesian algorithm and uses different levels of MNA (Multilevel Neighborhoods of Atoms) descriptors to represent the structural formulas of protein fragments containing an AAS.
Results
Prediction accuracy was assessed by 5-fold cross-validation and compared with that of other bioinformatics tools: SIFT4G, PolyPhen-2 HDIV, FATHMM, MetaSVM, PROVEAN, ClinPred, and AlphaMissense. The best SSPR models demonstrated high accuracy, with an average ROC AUC of 0.831 ± 0.037, balanced accuracy of 0.763 ± 0.034, Matthews correlation coefficient (MCC) of 0.457 ± 0.06, and F-measure of 0.623 ± 0.07 across all genes, outperforming the most popular bioinformatics tools.
Conclusions
The best SSPR models for predicting the pathogenicity of amino acid substitutions associated with PIDs have been implemented in a freely available web application, SAV-Pred (Single Amino acid Variants Predictor, http://www.way2drug.com/SAV-Pred/), which may be a useful tool for medical geneticists and clinicians. Examples of applying SAV-Pred to clinical cases of PIDs are provided.
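For readers unfamiliar with the four reported metrics, the following is a minimal Python sketch of how ROC AUC, balanced accuracy, MCC, and F-measure can be computed for a binary pathogenic/benign classifier. It is not the MultiPASS or SAV-Pred implementation. The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made purely for illustration and are not data from the study.

```python
from math import sqrt

def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = pathogenic, 0 = benign)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def balanced_accuracy(tp, tn, fp, fn):
    """Mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 if any marginal total is zero."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def f_measure(tp, tn, fp, fn):
    """F1 score: harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data (hypothetical, not from the study): 3 pathogenic and 5 benign substitutions.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

counts = confusion_counts(y_true, y_pred)
print("ROC AUC:", round(roc_auc(y_true, scores), 3))
print("Balanced accuracy:", round(balanced_accuracy(*counts), 3))
print("MCC:", round(mcc(*counts), 3))
print("F-measure:", round(f_measure(*counts), 3))
```

In a 5-fold cross-validation setting, these quantities would typically be computed on each held-out fold and then averaged. Balanced accuracy and MCC are informative here because pathogenic and benign variants are rarely present in equal numbers.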