Abstract

The assumption that RNA can be readily classified into either protein-coding or non-protein–coding categories has pervaded biology for close to 50 years. Until recently, discrimination between these two categories was relatively straightforward: most transcripts were clearly identifiable as protein-coding messenger RNAs (mRNAs), and readily distinguished from the small number of well-characterized non-protein–coding RNAs (ncRNAs), such as transfer, ribosomal, and spliceosomal RNAs. Recent genome-wide studies have revealed the existence of thousands of noncoding transcripts, whose function and significance are unclear. The discovery of this hidden transcriptome and the implicit challenge it presents to our understanding of the expression and regulation of genetic information has made the need to distinguish between mRNAs and ncRNAs both more pressing and more complicated. In this Review, we consider the diverse strategies employed to discriminate between protein-coding and noncoding transcripts and the fundamental difficulties that are inherent in what may superficially appear to be a simple problem. Misannotations can also run in both directions: some ncRNAs may actually encode peptides, and some of those currently thought to do so may not. Moreover, recent studies have shown that some RNAs can function both as mRNAs and intrinsically as functional ncRNAs, which may be a relatively widespread phenomenon. We conclude that it is difficult to annotate an RNA unequivocally as protein-coding or noncoding, with overlapping protein-coding and noncoding transcripts further confounding this distinction. In addition, the finding that some transcripts can function both intrinsically at the RNA level and to encode proteins suggests a false dichotomy between mRNAs and ncRNAs. Therefore, the functionality of any transcript at the RNA level should not be discounted.

Highlights

  • Numerous studies have demonstrated that the true catalog of RNAs encoded within the genome is more extensive and complex than previously thought

  • For instance, it has become apparent that the vast majority of the genome is transcribed, often in intricate networks of overlapping sense and antisense transcripts, many of which are alternatively spliced [1,4,5,6,7,8]

  • Despite an increasing number of long non-protein–coding RNAs (ncRNAs) having been shown to fulfill a diverse range of regulatory roles, the functions of the vast majority remain unknown and untested


Summary

Introduction

Numerous studies have demonstrated that the true catalog of RNAs encoded within the genome (the “transcriptome”) is more extensive and complex than previously thought (reviewed in [1,2,3]). Despite an increasing number of long ncRNAs having been shown to fulfill a diverse range of regulatory roles (reviewed in [15,16]), the functions of the vast majority remain unknown and untested. While this is true of small RNAs to some extent, long ncRNAs, unlike their smaller counterparts, lack obvious features that would allow a priori functional categorization or prediction.

To understand the nature of these challenges, the approaches used to distinguish noncoding from protein-coding transcripts are considered below.

PLoS Computational Biology | www.ploscompbiol.org

Strategies to Discriminate between ncRNAs and mRNAs
Bifunctional RNAs and the False Dichotomy
Findings
Conclusions