Abstract

Monogenic autoinflammatory diseases (AIDs) are a group of rare genetic diseases whose defining phenotype is inflammation. With the development of next-generation sequencing, pathogenic genes have been widely reported and used for clinical screening and diagnosis. The International Society for Systemic Autoinflammatory Diseases has recognized approximately 50 pathogenic genes and hundreds of related pathogenic variants in monogenic AIDs. Nonetheless, the pathogenic mechanisms of many of these genes remain poorly understood, and the pathogenicity of many candidate variants has yet to be determined. We aimed to investigate these pathogenic variants through a variant burden analysis to determine whether they share consistent characteristics.

We performed a variant burden analysis on the Genome Aggregation Database (gnomAD) cohort using the genetic variants currently reported in monogenic AIDs, analyzing the enrichment of allelic signatures and deleterious predictions at the variants. Allelic signatures were extracted from gnomAD, and the deleterious predictions were extracted from existing tools. The features obtained from the variant burden analysis were then used in a Random Forest model to classify the pathogenicity of novel mutations.

Functional enrichment and network analysis of AID pathogenic genes hinted at the possible involvement of previously unsuspected signals. The variant burden analysis demonstrated that the pathogenicity of a variant could not be reliably classified using only its own allele frequency and deleterious predictions. However, variants in different pathogenicity classes exhibited strikingly different allelic signature patterns in the upstream and downstream regions surrounding them. Furthermore, the distribution of deleterious variants surrounding the variants in the cohort varied significantly across pathogenicity categories. Finally, the cohort-based features extracted from the alleles were applied to the prediction of pathogenicity in monogenic AIDs, achieving superior prediction performance compared with other tools.

The cohort-based features have potential applications across a wider range of disease categories. The pathogenicity of a variant can be effectively classified on the basis of the variant frequency and deleterious prediction of the allele in the cohort, and this information can be used to improve the accuracy of current pathogenicity classification.
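
The idea of cohort-based features from the regions flanking a variant can be illustrated with a minimal sketch. It is not the authors' pipeline: the `CohortVariant` record, the `window_features` function, the 50-position flank, and the three summary statistics are all hypothetical choices made for illustration, and the abstract does not specify the actual window size or feature definitions.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class CohortVariant:
    """One cohort variant on a single chromosome (hypothetical record layout)."""
    pos: int             # genomic position
    allele_freq: float   # cohort allele frequency, e.g. taken from gnomAD
    deleterious: bool    # call from some in-silico predictor (assumed input)


def window_features(target_pos, cohort, flank=50):
    """Summarize cohort variants upstream and downstream of target_pos.

    For each flank, returns the number of cohort variants, their mean
    allele frequency, and the fraction predicted deleterious. The target
    position itself is excluded. The flank size is an assumption.
    """
    windows = (
        ("up", target_pos - flank, target_pos - 1),
        ("down", target_pos + 1, target_pos + flank),
    )
    feats = {}
    for side, lo, hi in windows:
        hits = [v for v in cohort if lo <= v.pos <= hi]
        feats[f"{side}_count"] = len(hits)
        feats[f"{side}_mean_af"] = mean(v.allele_freq for v in hits) if hits else 0.0
        feats[f"{side}_frac_deleterious"] = (
            sum(v.deleterious for v in hits) / len(hits) if hits else 0.0
        )
    return feats


# Toy cohort with made-up values, purely to show the call shape.
toy_cohort = [
    CohortVariant(90, 0.01, True),
    CohortVariant(95, 0.03, False),
    CohortVariant(110, 0.5, False),
    CohortVariant(200, 0.1, True),
]
features = window_features(100, toy_cohort, flank=20)
print(features)
```

Feature dictionaries of this kind, one per candidate variant, could then be assembled into a table and passed to any random forest implementation together with known pathogenicity labels; how the original study encoded and weighted its features is not stated here.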
