Abstract

Purpose: The persistence of hypermutable CGN (CGG, CGA, CGC, CGU) arginine codons at high frequency suggests the possibility of negative selective pressure at these sites and that arginine codon usage could be a predictive indicator of human disease genes.

Methods: We analyzed arginine codons (CGN, AGG, AGA) from all canonical Ensembl protein coding gene transcripts before comparing the frequency of CGN codons between genes with and without human disease associations and with gnomAD constraint metrics.

Results: The frequency of CGN codons among a gene’s total arginine codon count was higher in genes linked to syndromic autism spectrum disorder (ASD) compared with genes not associated with ASD. A comparison of genes annotated as dominant or recessive with control genes not matching either classification revealed a progressive increase in CGN codon frequency. Moreover, CGN frequency was positively correlated with a gene’s probability of loss-of-function intolerance (pLI) score and negatively correlated with observed-over-expected ratios for both loss-of-function and missense variants.

Conclusion: Our findings indicate that genes utilizing CGN arginine codons rather than AGG or AGA are more likely to underlie single-gene disorders, particularly for dominant phenotypes, and thus constitute candidate genes for the study of human genetic disease.
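
The per-gene metric described in the abstract, the share of CGN codons among all six arginine codons in a transcript, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `cgn_fraction`, the assumption that the input is an in-frame coding sequence starting at the first codon, and the choice to return `None` for sequences without arginine codons are all illustrative assumptions.

```python
from collections import Counter

# The six arginine codons, written in DNA alphabet (CGU in RNA is CGT here).
CGN_CODONS = {"CGA", "CGC", "CGG", "CGT"}
AGR_CODONS = {"AGA", "AGG"}


def cgn_fraction(cds):
    """Return the fraction of arginine codons that are CGN.

    Assumes `cds` is an in-frame coding sequence (hypothetical input format).
    Returns None when the sequence contains no arginine codons.
    """
    # Normalise case and accept RNA input by mapping U to T.
    seq = cds.upper().replace("U", "T")
    # Split into complete codons, ignoring any trailing partial codon.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    cgn = sum(counts[c] for c in CGN_CODONS)
    agr = sum(counts[c] for c in AGR_CODONS)
    total = cgn + agr
    return cgn / total if total else None


# Toy sequence (not real gene data): ATG CGG AGA CGT CGA TAA
print(cgn_fraction("ATGCGGAGACGTCGATAA"))
```

In the toy sequence, three of the four arginine codons are CGN, so the function returns 0.75. Comparing such per-gene values between gene sets, or correlating them with constraint scores, would be a separate step.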

