There is an old joke about a man who searches diligently under a lamp-post at night. He explains to a passer-by that he has lost his keys. “Did you lose them under the lamp-post?” “No,” he says. “Then why are you looking under the lamp-post?” “Because the light is better here.”
Are we currently looking for the keys to the secrets of RNA under the lamp-post? By revisiting the first issues of RNA published in 1995, it is clear that many articles tested hypotheses based on mechanistic models built on the determination of secondary and tertiary structures of RNA. In a perspective piece in the first issue, Olke Uhlenbeck opens with the statement, “A basic tenet of the emerging RNA religion is that most, if not all, RNA molecules fold into well-defined three-dimensional structures.” Many of us are still true believers of this RNA religion, but it seems that the focus of our current research has changed from deciphering the consequence of RNA structure to trying to understand the biological significance of the less complicated primary sequence of RNA. This prompts the question, have we changed focus merely because it is too technically challenging to carry out experiments based on RNA structure within our current cell models or multicellular organisms?
Today we are building models based on advanced in vivo analyses, and even though it is sometimes overwhelming, we are getting better at using information available from next generation sequencing (NGS). Twenty years ago, in the era Before Omics (BO) and before a well-developed World Wide Web, whole genome and transcriptome analysis was just a wild dream, even for the simplest unicellular organism. In the '90s, almost all of the RNA–RNA and RNA–protein interaction studies were done in vitro, often relying on in vitro transcribed RNA. A big headache was the constant fear of ending up in “alternative RNA structures hell,” and based on recommendations from biochemistry gurus, including Uhlenbeck, we applied all sorts of tricks in attempts to reconstruct the proper RNA interactions envisioned to form under native conditions. Applying both experimental and computational tools enabled the tertiary structures of very complicated molecules to be resolved in detail, a good example being the ribozyme. The catalytic activity of this molecule could thereby be explained in detail.
At the beginning of the 21st century, just a few years after the birth of the journal RNA, the world of gene-regulating small non-coding RNAs was all of a sudden uncovered and became visible. I think it is safe to say that the most prominent breakthrough in the field of RNA biology during the past 20 years is the discovery of microRNAs (miRNAs). This era got a jump start from work in the nematode model, when it was discovered that experimentally introduced double-stranded RNA (dsRNA) had a potent interfering effect on the expression of endogenous mRNA, even though the same phenomenon had been discovered in plants years earlier. Initially it was a mystery how man-made dsRNA could have the effect of knocking down the expression of specific genes. Was it just a cheap way to make gene deletions, invented by God just to keep us scientists happy? Eight years after their first publication on RNA interference in C. elegans, Andrew Fire and Craig Mello received the 2006 Nobel Prize in Physiology or Medicine. As a Swede, I cannot help being proud of living in the country that hosts this prestigious prize for scientists, and to see scientific achievements being celebrated at big and fancy parties, particularly if they are awarded to people in the RNA field. I can also assure everyone that the prize is celebrated at one hell of a party!
As it turns out, miRNAs are the endogenous interference molecules used to regulate gene expression in multicellular organisms, and the increasing knowledge about these small RNAs is having a huge impact on the entire field of molecular biology. In retrospect, it is puzzling how we managed to miss these molecules for so many years. How many of us unknowingly cut off and removed the bottom of the gels containing these miRNAs, just because we did not understand what they were and because they made the gels look ugly? It also makes me ponder how many “other” things we choose to ignore in our eagerness to simplify the mechanisms of gene regulation into generally accepted pathways.
The structure of the typical pre-miRNA is a fairly simple, short stem–loop, so much easier to comprehend than the complicated tertiary RNA structure of the typical ribozyme. By the way, Thomas Cech and Sidney Altman were awarded the 1989 Nobel Prize in Chemistry for their discovery that RNA molecules with a certain structure have enzyme-like catalytic properties. Details regarding the maturation process of miRNAs have now been worked out to the level where they are presented in textbooks for undergraduates. However, at the molecular level, the target recognition process is to a large extent an enigma.
The theory is simple: 7 or 8 nucleotide (nt) complementarity in the 5′ half of the 21 nt long mature miRNA to a sequence in the 3′ untranslated region of an mRNA is enough to inhibit expression. Inhibition is achieved either by mRNA degradation, inhibition of translation, or a combination of both. In practice, in each organism there are thousands of different miRNAs identified, most of them capable of binding to at least hundreds of potential targets. Small RNA-seq and NGS technologies have made it possible to identify miRNAs on a global basis and in parallel with the expression of potential targets. This is, however, based on the assumption that all target RNAs are accessible as naked single-stranded molecules, free from competing interacting proteins and RNA sequences. For decades we have known that this is not the case: all RNA molecules are covered by interacting proteins and engage in both intra- and intermolecular interactions with RNAs. Still we pretty much ignore this fact when it comes to miRNA target predictions, simply because it makes things too complicated.
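The simple theory can be made concrete with a minimal sketch of a seed-match search. It rests on assumptions that are not stated in the text: the seed is taken as nucleotides 2 to 8 of the miRNA (a common convention), only perfect Watson–Crick pairing counts, and the function names and toy sequences are purely illustrative.

```python
# Watson-Crick pairing only; G-U wobble pairs are ignored in this sketch.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_sites(mirna: str, utr: str, seed_start: int = 1, seed_len: int = 7) -> list[int]:
    """Return 0-based positions in `utr` that pair perfectly with the miRNA seed.

    Assumption: the seed is miRNA nucleotides 2-8 (0-based slice 1:8) by default.
    """
    seed = mirna[seed_start:seed_start + seed_len]
    site = reverse_complement(seed)  # the sequence the mRNA must carry
    return [
        i
        for i in range(len(utr) - seed_len + 1)
        if utr[i:i + seed_len] == site
    ]


# Toy sequences, chosen only for illustration.
mirna = "UGAGGUAGUAGGUUGUAUAGUU"
utr = "AAACUACCUCAGGGCUACCUCAUU"
print(seed_sites(mirna, utr))
```

Note that this sketch makes exactly the assumption questioned above: it treats the 3′ untranslated region as a naked, fully accessible single strand and knows nothing about bound proteins or RNA structure.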
Recent transcriptome analyses have provided surprising and exciting insights, revealing that nearly the entire eukaryotic genome is transcribed. New non-coding RNAs are being identified as we speak by the growing number of NGS efforts all over the world. Long non-coding RNA (lncRNA) is the group of molecules in fashion at the moment. However, lncRNAs are designated merely based on their length (>200 nt). Without knowing more about these RNAs, it seems risky to keep them bagged in one group, and it may even be misleading. Often lncRNAs exhibit little sequence conservation and are expressed at low levels. This should not be interpreted to mean that they are not important. Compared to DNA or proteins, RNA is a molecule with great potential in its plasticity, which makes it a great actor in the cellular interactome. I foresee that non-coding RNAs, expressed both independently and within genes, will turn out to have an even more prominent role in gene regulatory events.
With these discoveries, the determination of RNA structure is once again becoming a critical parameter required for the understanding of biological function. RNA modifications, multiple protein and ncRNA interactions, and co-transcriptional folding all need to be taken into account. However, accurately predicting folded RNA structures still represents a substantial challenge. The computational tools accessible to the non-specialized RNA researcher have in essence not changed during the past 20 years, and the same can generally be said for the experimental approaches routinely used for structural analysis. This does not mean that the existing tools are useless, but the field would benefit greatly if they were to go through the same type of metamorphosis as Sanger sequencing did for primary sequence analysis.
We have also reached a point where species-specific regulation is becoming increasingly evident, and as a consequence, we need to accept the existence of unique species-specific pathways. Thus, when we use different organisms as model systems, we should be aware that information obtained by studying the complexity of the mouse brain may not be transferable to our own brain. The era of discovering general pathways and mechanisms used by most organisms may have passed, and we have to turn to the importance of fine-tuning. Tissue-specific transcriptome analysis is becoming increasingly accepted, and with modern techniques this is fairly easy to do, even if we do not always know what the results mean. More of a challenge is single-cell transcriptome analysis, and even more difficult is subcellular RNA analysis. As always, detecting does not mean understanding function, so this is the beginning rather than the end of what will likely be a long journey of exploration. It is interesting to note that in 2014 almost 700 articles were published discussing long non-coding RNAs. By contrast, in 1995 the number of published articles was 54. With the number of publications just in our immediate field of research interest increasing exponentially, we are approaching difficult times in selecting what is important. Data-mining is, therefore, itself becoming an established field of research.
In the life After Omics (AO) we have to find our way back to study specific structures, mechanisms, and pathways, and make individual RNAs into our pet molecules again, in ways similar to the days when the journal RNA arrived in the mail as an analog paperbound magazine, something that gave great pleasure to have in your hand once a month. Those were the days when you were certain that every article published in RNA was at least glanced at by all the members of the RNA Society.