Abstract

The considerable accumulation of RNA-Seq transcriptome data has extended our understanding of the transcript composition of protein-coding genes. However, the alternatively compounded patterns of human protein-coding gene transcripts complicate the processing and interpretation of gene expression data. It is therefore essential to interrogate the complex mRNA isoforms of protein-coding genes exhaustively with a unified data resource. To identify representative mRNA transcript isoforms that can serve as references for transcriptome analysis, we used GTEx data to establish an expression data resource of top-ranked transcript isoforms for human protein-coding genes. Distinctive tissue-specific expression profiles and modulations could be observed for individual top-ranked transcripts of protein-coding genes. Protein-coding transcripts and genes account for a much higher fraction of expression in transcriptome data. In addition, top-ranked transcripts are the dominantly expressed ones in various normal tissues. Intriguingly, some of the top-ranked transcripts are noncoding splicing isoforms, which implies diverse gene regulation mechanisms. Comprehensive investigation of the tissue expression patterns of top-ranked transcript isoforms is therefore crucial. We thus established a web tool for examining top-ranked transcript isoforms in various normal human tissue types; it provides concise transcript information and easy-to-use graphical user interfaces. Investigating top-ranked transcript isoforms would contribute to understanding the functional significance of distinctive alternatively spliced transcript isoforms.

Highlights

  • The considerable accumulation of RNA-Seq transcriptome data has extended our understanding of the transcript composition of protein-coding genes

  • While we believe that this MANE-select dataset will be an excellent resource for future precision medicine applications and functional genomics research, there is still a need for thorough inspection of the tissue expression profiles of the multiple alternatively spliced transcripts of human protein-coding genes

  • We did not attempt to discover novel alternatively spliced mRNA isoforms; instead, we relied on the transcript annotations released by the Genotype-Tissue Expression (GTEx) project

Summary

Introduction

The considerable accumulation of RNA-Seq transcriptome data has extended our understanding of the transcript composition of protein-coding genes. Numerous large-scale RNA-Seq transcriptome studies, such as the Genotype-Tissue Expression (GTEx) project [9], the Cancer Cell Line Encyclopedia (CCLE) [10], and The Cancer Genome Atlas (TCGA) [11], have accumulated massive quantities of human gene expression information across tissues and pathological conditions. These studies have given us more information on the spliced transcript isoforms of protein-coding genes, as well as a better understanding of their expression profiles and translated protein products in human tissues and diseases. While we believe that this MANE-select dataset will be an excellent resource for future precision medicine applications and functional genomics research, there is still a need for thorough inspection of the tissue expression profiles of the multiple alternatively spliced transcripts of human protein-coding genes.

