Abstract

Bioinformatic Challenges of Big Data in Non-Coding RNA Research

Highlights

  • Before high-throughput sequencing techniques became available, computational programs were developed to search for new miRNAs in the available sequence data. These methods used one of the following approaches (Mendes et al, 2009): filter-based approaches, which identified small, high-quality sets of conserved miRNA candidates; machine learning methods, which determined an initial set of candidates with stem-loop structures; and target-centered approaches, which identified short conserved motifs in the 3′UTRs of protein-coding genes (Xie et al, 2005). Even though these algorithms were developed before the high-throughput sequencing era, they established a strong basis for bioinformatic analyses of big sequencing data; new non-protein-coding RNAs (ncRNAs) and their targets continue to be cataloged in many databases, with sufficient annotations available to the public.

  • High-throughput and deep sequencing techniques have offered a much improved avenue for ncRNA discovery (Lu et al, 2005), by searching genomic sequences for evidence of hairpin structures and determining whether the sequencing reads aligned to these structures resemble miRNA processing byproducts (Friedlander et al, 2008), or using a regularized

  • Because of the high sensitivity of the technique, the “raw” data will contain sequencing primers and contaminants that can introduce sequence bias, so more sophisticated computational approaches (Mendes et al, 2009) and cross-platform validation are required to sieve out miRNA transcripts.
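
To make the point about primers in the “raw” data concrete, the following is a minimal sketch of 3′ adapter trimming followed by a length filter. It is not the procedure used in the cited works; the function names, the adapter string in the usage note, the `min_overlap` threshold, and the 18 to 25 nt length window are illustrative assumptions, and exact string matching ignores sequencing errors that production trimmers tolerate.

```python
def trim_adapter(read: str, adapter: str, min_overlap: int = 5) -> str:
    """Remove a 3' adapter (complete, or partial at the read end) from a read.

    Assumption: exact matching only; mismatches and indels are not handled.
    """
    # Case 1: the complete adapter occurs in the read; cut at its first position.
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    # Case 2: only a prefix of the adapter fits before the read ends.
    # Try the longest possible overlap first, down to min_overlap bases.
    longest = min(len(adapter), len(read))
    for k in range(longest, max(min_overlap, 1) - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:len(read) - k]
    # No adapter evidence: return the read unchanged.
    return read


def filter_reads(reads, adapter, min_len=18, max_len=25):
    """Trim each read and keep those whose length is in a miRNA-like window.

    The 18-25 nt window is an illustrative assumption, not a fixed standard.
    """
    kept = []
    for read in reads:
        trimmed = trim_adapter(read, adapter)
        if min_len <= len(trimmed) <= max_len:
            kept.append(trimmed)
    return kept
```

As a usage example, with a hypothetical adapter `"TGGAATTCTCGGGTGCCAAGG"`, calling `filter_reads` on a 22 nt insert followed by that adapter keeps the 22 nt insert, while a 4 nt insert followed by the adapter and an untrimmed 40 nt read are both discarded by the length filter.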

Summary

Introduction

Before high-throughput sequencing techniques became available, computational programs were developed to search for new miRNAs in the available sequence data. These methods used one of the following approaches (Mendes et al, 2009): filter-based approaches, which identified small, high-quality sets of conserved miRNA candidates; machine learning methods, which determined an initial set of candidates with stem-loop structures; and target-centered approaches, which identified short conserved motifs in the 3′UTRs of protein-coding genes (Xie et al, 2005).
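
The target-centered idea can be illustrated with a toy sketch that looks for short motifs shared by orthologous 3′UTRs. This is a deliberate simplification and not the method of Xie et al (2005): it assumes unaligned orthologous UTR sequences are already grouped per gene, treats a k-mer as "conserved" when it appears in every ortholog, and ranks k-mers by the number of genes in which they are conserved. The function names, the input layout, and the default `k=7` are hypothetical choices made for illustration.

```python
from collections import Counter


def kmers(seq: str, k: int) -> set:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def conserved_kmers(orthologs, k: int = 7) -> set:
    """k-mers present in every orthologous 3'UTR of a single gene."""
    sets = [kmers(seq.upper(), k) for seq in orthologs]
    return set.intersection(*sets) if sets else set()


def rank_motifs(utrs_by_gene, k: int = 7):
    """Rank k-mers by the number of genes in which they are conserved.

    utrs_by_gene maps a gene name to a list of orthologous 3'UTR sequences
    (hypothetical input layout).
    """
    counts = Counter()
    for orthologs in utrs_by_gene.values():
        # Each gene contributes at most one count per k-mer.
        counts.update(conserved_kmers(orthologs, k))
    return counts.most_common()
```

A motif that is conserved across species in many unrelated genes is a candidate regulatory site. A realistic analysis would additionally need genome alignments and a background model to separate such motifs from k-mers that recur by chance.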
