Abstract

Accumulating evidence shows that non-coding RNAs (ncRNAs) play important roles in cellular biological processes and disease pathogenesis. High-throughput techniques have produced a large number of ncRNAs whose functions remain unknown. Because accurately identifying the family of an ncRNA helps in studying its function, predicting the family of each ncRNA is both necessary and urgent. Although several established methods can predict ncRNA families, their complex procedures and limited accuracy remain major problems. The main idea of those methods is first to predict the secondary structure and then to identify the ncRNA family from properties of that structure. Unfortunately, errors accumulate across these steps, and the imperfection of RNA secondary structure prediction tools in particular may be the cause of the low accuracy. In this paper, a novel end-to-end method, 'ncRFP', is proposed to complete the prediction task using deep learning. Instead of predicting the secondary structure, ncRFP predicts the ncRNA family by automatically extracting features from ncRNA sequences. Compared with other methods, ncRFP not only simplifies the process but also improves accuracy. The source code of ncRFP is available at https://github.com/linyuwangPHD/ncRFP.

Highlights

  • The expression of protein-coding genes has been the focus of life science research for decades

  • The primary structure is the base sequence. Although ncRNA sequences appear to be less conserved than secondary/tertiary structures, ncRNAs of the same family contain unified seeds

  • The secondary structure is the planar structure that forms when the sequence folds and pairs bases internally, combining secondary structure elements of various specific shapes

Summary

INTRODUCTION

The expression of protein-coding genes (messenger RNAs: mRNAs) has been the focus of life science research for decades. GraPPLE uses machine learning methods to predict the ncRNA family from secondary structure features. RNAcon considers 20 graph features obtained from the predicted ncRNA secondary structure and adopts a random forest (RF) classifier. NRC currently represents the state-of-the-art method: typical secondary structure features are first extracted by Moss [16] and processed into one-hot code, and a convolutional neural network is then employed to identify ncRNAs. These traditional methods all need to obtain the secondary structure of the ncRNA first and then identify the ncRNA family from secondary structure features. By contrast, the end-to-end ncRFP works directly on the sequence, which simplifies the process and has the potential to improve prediction accuracy.
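To make the idea of feeding a raw sequence to a model more concrete, the following minimal sketch turns an ncRNA sequence into a fixed-size one-hot matrix. It is an illustration only: the source does not state which input encoding ncRFP uses, and the names `BASES`, `one_hot_encode`, and `max_len` are hypothetical, as is the choice to map unknown symbols and padding to all-zero rows.

```python
from typing import List

# Assumed alphabet for illustration; DNA-style T is mapped to U below.
BASES = "ACGU"


def one_hot_encode(seq: str, max_len: int) -> List[List[int]]:
    """Encode an RNA sequence as a max_len x 4 one-hot matrix.

    Sequences longer than max_len are truncated. Unknown symbols
    (for example N) and padding positions become all-zero rows.
    """
    seq = seq.upper().replace("T", "U")[:max_len]
    matrix = []
    for base in seq:
        row = [0] * len(BASES)
        idx = BASES.find(base)
        if idx >= 0:  # leave the row all-zero for unknown symbols
            row[idx] = 1
        matrix.append(row)
    # Pad with all-zero rows so every sequence has the same shape.
    matrix.extend([0] * len(BASES) for _ in range(max_len - len(seq)))
    return matrix


encoded = one_hot_encode("GCAUN", max_len=6)
for row in encoded:
    print(row)
```

A matrix of this shape could be passed to a sequence model without any secondary structure prediction step, which is the property that distinguishes an end-to-end approach from the structure-based pipelines described above.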

Data Collection and Processing
Method
RESULT