Abstract

Substitutions that disrupt pre-mRNA splicing are a common cause of genetic disease. On average, 13.4% of all hereditary disease alleles are classified as splicing mutations mapping to the canonical 5′ and 3′ splice sites. However, splicing mutations in exons and at deeper intronic positions are vastly underreported. A recent re-analysis of coding mutations in exon 10 of the Lynch Syndrome gene, MLH1, revealed an extremely high rate (77%) of mutations that lead to defective splicing. This finding is confirmed by extending the sampling to five other exons in the MLH1 gene. Further analysis suggests a more general phenomenon of defective splicing driving Lynch Syndrome. Of the 36 mutations tested, 11 disrupted splicing. Furthermore, analysis of past reports suggests that MLH1 mutations in canonical splice sites also account for a much higher fraction (36%) of total mutations than expected. In a comprehensive analysis of splicing mutations in human disease genes, we found that the three main causal genes of Lynch Syndrome, MLH1, MSH2, and PMS2, belong to a class of 86 disease genes that are enriched for splicing mutations. Other cancer genes were also enriched among these 86 susceptible genes. The enrichment of splicing mutations in hereditary cancers strongly argues for giving splicing additional priority when interpreting clinical sequencing data in relation to cancer.

Introduction

As the cost of sequencing technologies declines, the number of genomes and exomes sequenced is increasing, resulting in an expanding archive of genetic variation in both diseased and healthy individuals [1, 2]. Most tools used to determine the pathogenicity of variants rely on in silico methods that assess protein features associated with the variant, and they fail to take into account the potential regulatory functions of sequences in gene processing mechanisms and expression [7]. Variants that alter the regulatory regions necessary for splicing typically delete large portions of the coding sequence and generally yield a non-functional protein [8]. Among the reported sequence variants, splicing mutations located at the 5′ and 3′ canonical exon-intron boundaries, or splice sites, make up 13.4% of the disease-causing mutations reported in the Human Gene Mutation Database (HGMD) [9]. A recent re-analysis of 20 coding mutations located in exon 10 of MLH1 revealed a high proportion of previously uncharacterized exonic splicing mutations (ESM) (17 of the 20 or 77%) [11]. Using the position dependence of splicing elements as a measure to infer disruptive splicing, it has recently been predicted that one-third of all disease-causing variants lead to aberrant splicing [12].
