Abstract

Convolutional neural networks (CNNs) and long short-term memory (LSTM) networks have proven effective at feature extraction and natural language processing, respectively. We therefore apply them to the language of RNA: predicting microRNA (miRNA) sequences from mRNA sequences. A CNN extracts features from the mRNA sequence, and an LSTM network uses those features to predict the miRNA. The model learned basic targeting features such as the seed match at nucleotides 2-8 of the miRNA, counted from the 5' end toward the 3' end. It was also able to predict G-U wobble base pairs in the seed region. On experimentally validated data, the model predicted on average 72 percent of the miRNAs for a given mRNA. On microarray data generated using anti-miRNAs for 25 miRNAs, its predicted targets showed the highest positive expression fold change compared with other prediction tools. Code is available at https://github.com/rajkumar1501/sequence-prediction-using-CNN-and-LSTMs.
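To make the two sequence features named above concrete, the following is a minimal sketch of a rule-based seed-match check with optional G-U wobble pairing. It is not the authors' model and is not taken from their repository. The function names (`seed`, `find_seed_sites`), the 7-nucleotide seed window, and the example sequences are assumptions chosen for illustration only.

```python
# Illustrative sketch only: a rule-based seed-match scan, not the CNN-LSTM model.
# All names and example sequences here are hypothetical.

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def seed(mirna):
    """Return nucleotides 2-8 of the miRNA, counted from the 5' end."""
    return mirna[1:8]


def find_seed_sites(mirna, mrna, allow_wobble=True):
    """Scan an mRNA (5'->3') for windows that pair with the miRNA seed.

    Returns a list of (start_index, wobble_count) tuples, where start_index
    is the 0-based position of the window in the mRNA.
    """
    # Accept DNA-style input by converting T to U.
    mirna = mirna.upper().replace("T", "U")
    mrna = mrna.upper().replace("T", "U")
    s = seed(mirna)
    allowed = WATSON_CRICK | WOBBLE if allow_wobble else WATSON_CRICK

    hits = []
    for start in range(len(mrna) - len(s) + 1):
        window = mrna[start:start + len(s)]
        # Pairing is antiparallel: the first seed base pairs with the
        # last base of the mRNA window, and so on.
        pairs = list(zip(s, reversed(window)))
        if all(p in allowed for p in pairs):
            hits.append((start, sum(p in WOBBLE for p in pairs)))
    return hits


example_mirna = "UGAGGUAGUAGGUUGUAUAGUU"   # example miRNA; seed is GAGGUAG
perfect_site = "AAACUACCUCAAA"            # contains the reverse complement of the seed
wobble_site = "AAACUACCUUAAA"             # same site with one G-U wobble pair

print(find_seed_sites(example_mirna, perfect_site))
print(find_seed_sites(example_mirna, wobble_site))
print(find_seed_sites(example_mirna, wobble_site, allow_wobble=False))
```

A learned model would not apply such a rule explicitly. The sketch only shows what a seed match and a G-U wobble pair look like at the sequence level, which is the kind of pattern the abstract reports the model picking up.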
