Abstract

Small non-coding RNAs (ncRNAs) are short non-coding sequences involved in gene regulation in many biological processes and diseases. Their biological functions remain incompletely understood, especially at genome-wide scale, which has created demand for new computational approaches to annotate their roles. Secondary structure is widely regarded as a key determinant of RNA function, and machine learning approaches have successfully predicted RNA function from secondary structure information. Here we show that RNA function can be predicted with good accuracy from a lightweight representation of sequence information, without computing secondary structure features, which is computationally expensive. This finding appears to go against the dogma that secondary structure is a key determinant of RNA function. Compared to recent secondary structure based methods, the proposed solution is more robust to sequence boundary noise and drastically reduces computational cost, allowing annotation of large data volumes. Scripts and datasets to reproduce the experiments in this study are available at: https://github.com/bioinformatics-sannio/ncrna-deep.
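As a rough illustration of what a lightweight sequence-only input could look like, the sketch below one-hot encodes an RNA sequence to a fixed length and simulates boundary noise by adding random flanking nucleotides. The encoding scheme, the function names `one_hot_encode` and `add_boundary_noise`, and the `max_flank` parameter are assumptions made for this example; they are not taken from the study, whose actual representation is defined in the linked repository.

```python
import random

ALPHABET = "ACGU"  # assumed nucleotide order for the four one-hot channels


def one_hot_encode(seq, max_len):
    """Return max_len rows of 4 values (A, C, G, U), truncated or zero-padded."""
    seq = seq.upper().replace("T", "U")[:max_len]
    rows = []
    for base in seq:
        row = [0, 0, 0, 0]
        if base in ALPHABET:  # ambiguous symbols such as N stay all-zero
            row[ALPHABET.index(base)] = 1
        rows.append(row)
    # Pad short sequences with all-zero rows so every input has the same shape.
    rows.extend([0, 0, 0, 0] for _ in range(max_len - len(seq)))
    return rows


def add_boundary_noise(seq, max_flank, rng):
    """Add 0..max_flank random nucleotides on each side (hypothetical noise model)."""
    left = "".join(rng.choice(ALPHABET) for _ in range(rng.randint(0, max_flank)))
    right = "".join(rng.choice(ALPHABET) for _ in range(rng.randint(0, max_flank)))
    return left + seq + right


if __name__ == "__main__":
    rng = random.Random(0)
    noisy = add_boundary_noise("GCAUUGG", max_flank=3, rng=rng)
    print(noisy)
    print(one_hot_encode(noisy, max_len=16))
```

A matrix like this can be fed directly to a classifier, so no folding step is needed before prediction; comparing predictions for the original and the flanked sequence is one simple way to probe sensitivity to imprecise sequence boundaries.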

Highlights

  • Recent advances in whole transcriptome sequencing have led to the discovery of novel transcribed elements with no apparent functional or protein-coding potential

  • Recent advances in high-throughput technologies have enabled the discovery of a large number of novel transcribed elements, called ncRNAs, that were previously considered to lack functional potential. ncRNAs are a very heterogeneous group of RNAs in terms of length, biogenesis, and function, and can be divided into long non-coding RNAs and short non-coding RNAs

  • Because of their complex nature, a full understanding of ncRNAs remains a major challenge, which calls for computational approaches able to detect ncRNAs and annotate their biological functions according to family identity

Introduction

Recent advances in whole transcriptome sequencing have led to the discovery of novel transcribed elements with no apparent functional or protein-coding potential. Several classes of non-coding RNAs (ncRNAs) have been discovered in recent years, underscoring their importance as regulators of cellular development and differentiation. ncRNAs are classified into two major classes according to their length: short (<200 nucleotides) and long (>200 nucleotides) ncRNAs. ncRNAs are known to regulate gene expression at both the post-transcriptional and transcriptional levels, affect the organization and modification of chromatin, or have catalytic functions [2]. Short ncRNAs include ribosomal RNAs (rRNAs) and transfer RNAs (tRNAs) involved in mRNA translation, small nuclear RNAs (snRNAs) involved in splicing, small nucleolar RNAs (snoRNAs) involved in the modification of rRNAs, and microRNAs (miRNAs) involved in targeted translational repression and gene silencing. The functional characterization of ncRNAs on a wide scale is currently one of the main challenges of modern genome biology as, compared to protein coding RNAs, they are usually less conserved and expressed

