Abstract
Multiple mRNA isoforms of the same gene are produced via alternative splicing, a biological mechanism that regulates protein diversity while maintaining genome size. Alternatively spliced mRNA isoforms of the same gene may have very similar sequences, yet they can differ markedly in their effects on cellular function and regulation. The products of alternative splicing have important and diverse functional roles, including response to environmental stress, regulation of gene expression, and involvement in heritable human diseases and plant diseases. Despite the functional importance of mRNA isoforms, very little has been done to annotate their functions. Recent years have, however, seen the development of several computational methods aimed at predicting biological function at the mRNA isoform level. These methods use a wide array of proteo-genomic data to develop machine learning-based mRNA isoform function prediction tools. In this review, we discuss the computational methods developed for predicting biological function at the individual mRNA isoform level.
Highlights
Cells can produce multiple mRNA isoforms from a single gene through alternative splicing (AS), a post-transcriptional regulatory mechanism. mRNA isoform sequences from a single gene may differ by anything from a few base pairs to several exons or introns.
Results from the Human Genome Project revealed that the human genome contains ~25,000 protein-coding genes, which is remarkably close to the number in the nematode C. elegans (20,000 genes) and fewer than in rice (40,000 genes), suggesting that organismal complexity cannot be explained merely by the number of genes.
Despite the significant role of alternatively spliced mRNA isoforms in controlling organismal complexity, limited progress has been made towards annotating their functions.
Summary
Cells can produce multiple mRNA isoforms from a single gene through alternative splicing (AS), a post-transcriptional regulatory mechanism. mRNA isoform sequences from a single gene may differ by anything from a few base pairs to several exons or introns. This wealth of data at the mRNA isoform level provides evidence confirming the differential expression of mRNA isoforms under different conditions [15,16,17]. Such evidence has refined and improved genome annotations by identifying previously unknown gene functions attributable to an alternatively spliced mRNA isoform product [14]. This is mainly because most functional data at the genomic level is analyzed for genes and not mRNA isoforms. As a result, very few mRNA isoform pairs with functional information are available for developing models of mRNA isoform-isoform interaction (III) functional networks. A major drawback of IIIDB is that it limits its predictions to protein-protein interactions (PPIs) already present in IntAct. Because of this, new functional annotations cannot be assigned at the gene or the mRNA isoform level. Selecting the negative set based on predicted subcellular locations, while better than selecting a random non-positive set, is still biased and propagates “error of prediction”.