Abstract
Background: mRNA-Seq technology has revolutionized the field of transcriptomics for identification and quantification of gene transcripts not only at the gene level but also at the isoform level. Estimating the expression levels of transcript isoforms from mRNA-Seq data is a challenging problem due to the presence of constitutive exons.

Results: We propose a novel algorithm (IsoformEx) that employs a weighted non-negative least squares estimation method to estimate the expression levels of transcript isoforms. Validations based on in silico simulation of mRNA-Seq and qRT-PCR experiments with real mRNA-Seq data showed that IsoformEx could accurately estimate transcript expression levels. In comparisons with published methods, the transcript expression levels estimated by IsoformEx showed higher correlation with known transcript expression levels from simulated mRNA-Seq data, and higher agreement with qRT-PCR measurements of specific transcripts for real mRNA-Seq data.

Conclusions: IsoformEx is a fast and accurate algorithm to estimate transcript expression levels and gene expression levels, which takes into account short exons and alternative exons with a weighting scheme. The software is available at http://bioinformatics.wistar.upenn.edu/isoformex.
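To make the idea of weighted non-negative least squares concrete, the following is a minimal sketch and not the IsoformEx implementation. It assumes a toy gene with three exons and two isoforms, a hypothetical exon-by-isoform indicator matrix `A`, hypothetical exon read densities `b`, and hypothetical per-exon weights `w` (the actual IsoformEx weighting scheme is not given in this text). The solver is a plain cyclic coordinate descent, chosen here only to keep the example dependency-free.

```python
def weighted_nnls(A, b, w, iters=500):
    """Minimise sum_i w[i] * (sum_j A[i][j] * x[j] - b[i])**2 subject to x >= 0.

    Solved by cyclic coordinate descent on the normal equations.
    A: rows = exons, columns = isoforms (1 if the isoform contains the exon).
    b: observed read density per exon.
    w: non-negative weight per exon (hypothetical values in this sketch).
    """
    n_rows, n_cols = len(A), len(A[0])
    # H = A^T W A and g = A^T W b
    H = [[sum(w[i] * A[i][j] * A[i][k] for i in range(n_rows))
          for k in range(n_cols)] for j in range(n_cols)]
    g = [sum(w[i] * A[i][j] * b[i] for i in range(n_rows))
         for j in range(n_cols)]
    x = [0.0] * n_cols
    for _ in range(iters):
        for j in range(n_cols):
            if H[j][j] == 0.0:
                continue  # isoform touches no weighted exon; keep it at zero
            rest = sum(H[j][k] * x[k] for k in range(n_cols) if k != j)
            # Exact minimiser along coordinate j, clipped at zero
            x[j] = max(0.0, (g[j] - rest) / H[j][j])
    return x


# Toy gene: exons 1 and 3 are constitutive, exon 2 is alternative.
# Isoform 0 contains exons 1, 2, 3; isoform 1 skips exon 2.
A = [[1, 1],
     [1, 0],
     [1, 1]]
b = [8.0, 3.0, 8.0]   # hypothetical exon read densities
w = [1.0, 2.0, 1.0]   # hypothetical weights favouring the alternative exon

estimate = weighted_nnls(A, b, w)
print([round(v, 3) for v in estimate])
```

In this toy case the constitutive exons alone cannot separate the two isoforms, since both contribute to their density; the alternative exon supplies the distinguishing signal, which is one intuition for weighting exons differently. The non-negativity constraint matters when the data are noisy: with `b = [5.0, 6.0, 5.0]` the unconstrained solution would assign a negative level to the second isoform, whereas the constrained estimate sets it to zero.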
Highlights
mRNA-Seq technology has revolutionized the field of transcriptomics for identification and quantification of gene transcripts at the gene level and at the isoform level.
An important aspect of this complexity is the generation of multiple transcript isoforms from a single gene in a genomic locus, due to the use of alternative initiation and/or termination of transcription and alternative splicing of pre-mRNAs [2,3,4].
The results showed that the performance of IsoformEx was better than the other methods for estimating transcript expression levels, with the least error and the highest correlation coefficient.
Summary
mRNA-Seq technology has revolutionized the field of transcriptomics for identification and quantification of gene transcripts at the gene level and at the isoform level. Estimating the expression levels of transcript isoforms from mRNA-Seq data is a challenging problem due to the presence of constitutive exons. Although the functional consequences of differential expression of alternative isoforms have been known for some genes, the advent of massively parallel sequencing technology has facilitated the study of transcript isoforms at genome scale. Deep sequencing of cDNA fragments from 15 human tissue and cell line transcriptomes showed that 92-94% of human genes undergo alternative splicing [5]. The transcript variants are differentially expressed across different tissue/cell types, developmental stages and disease conditions. In order to study gene function at the isoform level, it is necessary to know the expression of each transcript in various physiological and disease conditions.