Abstract

As a type of RNA modification, N7-methylguanosine (m7G) is vital for the regulation of gene expression, but identifying m7G sites manually is costly and inefficient. A computational method that predicts m7G sites with high accuracy would therefore support in-depth research into their mechanisms. In this work, inspired by current deep learning and natural language processing (NLP) technologies, a method named m7G-DLSTM is designed by combining long short-term memory (LSTM) networks with a fully connected network. After evaluation of various features, single nucleotide binary code and nucleotide chemical property are adopted as the feature encoding scheme. In comparison, the Double-LSTM model, based on the two subsequences obtained by splitting each sequence at its center guanosine site, performs better than the Single-LSTM model based on the whole sequence. Among the reading directions tested, the direction from edge to center gives the highest performance in the Double-LSTM models. Finally, m7G-DLSTM achieves a specificity of 94.37%, a sensitivity of 92.96% and an accuracy of 93.66%, which indicates that m7G-DLSTM can be a useful tool for predicting human RNA N7-methylguanosine sites.
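The encoding and the split at the center guanosine can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not give the code tables, so the one-hot layout in `BINARY`, the chemical-property triples in `NCP` (the values commonly used in the literature), the exclusion of the center G from both halves, and the function names are all assumptions made for illustration.

```python
import numpy as np

# Assumed single nucleotide binary code (one-hot over A, C, G, U).
BINARY = {"A": (1, 0, 0, 0), "C": (0, 1, 0, 0),
          "G": (0, 0, 1, 0), "U": (0, 0, 0, 1)}

# Assumed nucleotide chemical property code: three binary flags per
# nucleotide, using the triples commonly seen in the literature.
NCP = {"A": (1, 1, 1), "C": (0, 1, 0), "G": (1, 0, 0), "U": (0, 0, 1)}


def encode(seq):
    """Encode an RNA string as a (length, 7) array: 4 binary + 3 property bits."""
    return np.array([BINARY[n] + NCP[n] for n in seq], dtype=np.float32)


def split_edge_to_center(seq):
    """Split an odd-length sequence at its center G into two encoded halves.

    Both halves are ordered from the sequence edge toward the center, so the
    downstream half is reversed. The center G itself is left out (assumption).
    """
    center = len(seq) // 2
    if len(seq) % 2 == 0 or seq[center] != "G":
        raise ValueError("expected an odd-length sequence with G at the center")
    upstream = seq[:center]               # already runs edge -> center
    downstream = seq[center + 1:][::-1]   # reversed to run edge -> center
    return encode(upstream), encode(downstream)


# Hypothetical 9-nt example; real samples would be longer.
up, down = split_edge_to_center("AUCGGACGU")
print(up.shape, down.shape)
```

In a Double-LSTM arrangement of this kind, each half would be fed to its own LSTM and the two final states concatenated before the fully connected layers; that wiring is likewise an assumption drawn from the abstract's description rather than a confirmed detail.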
