Abstract

MicroRNAs are a class of small RNAs involved in post-transcriptional gene silencing with roles in disease and development. Many computational tools have been developed to identify novel microRNAs. However, there have been no attempts to predict cleavage sites for Drosha from primary sequence, or to identify cleavage sites using deep neural networks. Here, we present DeepMirCut, a recurrent neural network-based software that predicts both Dicer and Drosha cleavage sites. We built a microRNA primary sequence database including flanking genomic sequences for 34,713 microRNA annotations. We compare models trained on sequence data, sequence and secondary structure data, as well as input data with annotated structures. Our best model is able to predict cuts within closer average proximity than results reported for other methods. We show that a guanine nucleotide before and a uracil nucleotide after Dicer cleavage sites on the 3′ arm of the microRNA precursor had a positive effect on predictions while the opposite order (U before, G after) had a negative effect. Our analysis was also able to predict several positions where bulges had either positive or negative effects on the score. We expect that our approach and the data we have curated will enable several future studies.

Highlights

  • MicroRNAs are a conserved class of endogenous small RNAs around 22 nucleotides in length

  • Because plant and animal microRNA biogenesis is very different (Kurihara and Watanabe, 2004; Axtell et al, 2011), and because more data is available for metazoa to train deep learning models, we focused this current study on precursors from metazoan species having both mature microRNAs (5 and 3′) annotated in miRBase, which consists of 11,296 records

  • When comparing Drosha cleavage site prediction, we found that DeepMirCut predicted Drosha cut on the 5′ arm (DR5) sites for mirtrons with a position shift error (PSE) of 3.56 and predicted DR5 sites for canonical microRNA with a PSE of 2.57

Read more

Summary

Introduction

MicroRNAs (miRs) are a conserved class of endogenous small RNAs around 22 nucleotides (nt) in length. Nucleotide positions 2 through 8 on the mature microRNA, called the seed sequence, help direct the sequence-specific activity of the RNA-induced silencing complex (RISC), where it binds to a complementary strand on the 3′ UTR of an mRNA transcript. MicroRNA may bind to target sites along CDS of RNA (Schnall-Levin et al, 2010; Zhang et al, 2018). Several CDS target-sites are known to suppress MicroRNA regulatory activity by acting as microRNA sponges (Ebert et al, 2007), including circular RNAs (Hansen et al, 2013), and long noncoding RNAs (Cheng and Lin, 2013). Other target-sites such as those found on a lncRNA called Cyrano can lead to target-directed miRNA degradation (TDMD) (Kleaveland et al, 2018; Han et al, 2020; Shi et al, 2020). ZSWIM8 ubiquitin ligase plays a role in TDMD by polyubiquinating Argonaut, which

Methods
Results
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.