Massive computational identification of somatic variants in exonic splicing enhancers using The Cancer Genome Atlas.

Kousuke Tanimoto, Tomoki Muramatsu, Johji Inazawa

doi:10.1002/cam4.2619

Abstract

Owing to the development of next‐generation sequencing (NGS) technologies, a large number of somatic variants have been identified in various types of cancer. However, the functional significance of most somatic variants remains unknown. Somatic variants that occur in exonic splicing enhancer (ESE) regions are thought to prevent serine and arginine‐rich (SR) proteins from binding to ESE sequence motifs, which leads to exon skipping. We computationally identified somatic variants in ESEs by compiling numerous open‐access datasets from The Cancer Genome Atlas (TCGA). Using somatic variants and RNA‐seq data from 9635 patients across 32 TCGA projects, we identified 646 ESE‐disrupting variants. The false positive rate of our method, estimated using a permutation test, was approximately 1%. Of these ESE‐disrupting variants, approximately 71% were located in the binding motifs of four classical SR proteins. ESE‐disrupting variants occurred in proportion to the number of somatic variants, but not necessarily in the specific genes associated with the biological processes of cancer. Existing bioinformatics tools could not predict the pathogenicity of ESE‐disrupting variants identified in this study, although these variants could cause exon skipping. We demonstrated that ESE‐disrupting nonsense variants tended to escape nonsense‐mediated decay surveillance. Using integrated analyses of open access data, we could specifically identify ESE‐disrupting variants. We have generated a powerful tool, which can handle datasets without normal samples or raw data, and thus contribute to reducing variants of uncertain significance because our statistical approach only uses the exon‐junction read counts from the tumor samples.
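The abstract mentions two computational ingredients: a statistic built only from exon-junction read counts in tumor samples, and a permutation test used to estimate the false positive rate. The sketch below illustrates how such a pipeline could look; it is not the authors' implementation. The function names, the empirical one-sided test, and the `alpha` threshold are all assumptions made for illustration, since this excerpt does not describe the actual statistical procedure.

```python
import random


def skipping_ratio(inclusion_reads, skipping_reads):
    """Fraction of junction reads that support skipping of an exon.

    inclusion_reads: reads spanning junctions that include the exon.
    skipping_reads: reads spanning the junction that bypasses the exon.
    Returns None when there are no informative reads.
    """
    total = inclusion_reads + skipping_reads
    return skipping_reads / total if total else None


def is_skipping_outlier(carrier_ratio, other_ratios, alpha=0.05):
    """Hypothetical empirical one-sided test (an assumption, not the paper's test).

    The variant carrier is called when the share of non-carrier tumors whose
    skipping ratio is at least as high as the carrier's falls below alpha.
    """
    if carrier_ratio is None or not other_ratios:
        return False
    at_least = sum(r >= carrier_ratio for r in other_ratios)
    p_value = (at_least + 1) / (len(other_ratios) + 1)
    return p_value < alpha


def permutation_false_positive_rate(ratios, n_permutations=1000, alpha=0.05, seed=0):
    """Estimate how often a randomly chosen sample would be called.

    Each permutation assigns the 'carrier' label to a random sample and
    applies the same test against all remaining samples.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_permutations):
        idx = rng.randrange(len(ratios))
        others = ratios[:idx] + ratios[idx + 1:]
        hits += is_skipping_outlier(ratios[idx], others, alpha)
    return hits / n_permutations
```

In this sketch, a carrier tumor with 70 skipping reads and 30 inclusion reads would be compared against the skipping ratios of tumors without the variant in the same exon; normal samples and raw reads are not needed, which mirrors the property the abstract emphasizes.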

Highlights

Owing to the rapid progress of next‐generation sequencing (NGS) technologies, an enormous amount of omics data, across every type of cancer, has been analyzed and shared through public databases
We hypothesized that exonic splicing enhancer (ESE)‐disrupting variants could be identified massively by compiling somatic variant and gene expression data obtained from the public database The Cancer Genome Atlas (TCGA)
Of the 156 794 somatic variants analyzed in this study, 0.41% (0%‐0.72%) were identified as ESE‐disrupting variants

Summary

INTRODUCTION

Owing to the rapid progress of next‐generation sequencing (NGS) technologies, an enormous amount of omics data, across every type of cancer, has been analyzed and shared. These omics data, including somatic variants, whole transcriptome data, and DNA methylation profiles, have been associated with clinical information and used to classify cancer types based on omics profiles and to explore molecular targets for therapeutics. Mort et al used a machine learning approach to predict exonic variants that disrupt splicing.[12] These studies integrated analyses of genome and transcriptome datasets from different individuals. Transcriptome information is important to elucidate splicing regulation. Given these facts, we hypothesized that ESE‐disrupting variants could be identified massively by compiling somatic variant and gene expression data obtained from the public database The Cancer Genome Atlas (TCGA). We computationally identified somatic variants in ESEs using a variety of population genomics approaches and numerous open access datasets from TCGA.

MATERIALS AND METHODS

RESULTS

Findings

DISCUSSION

CONFLICTS OF INTEREST

Journal: Cancer Medicine
Publication Date: Oct 21, 2019
License type: CC BY 4.0

