Abstract

Long noncoding RNA (lncRNA) plays a crucial role in many critical biological processes and participates in complex human diseases through interaction with proteins. Because identifying lncRNA–protein interactions experimentally is expensive and time-consuming, we propose LGFC-CNN, a novel deep learning method that combines raw sequence composition features, hand-designed features and structure features to predict lncRNA–protein interactions. Two sequence preprocessing methods and two CNN modules (GloCNN and LocCNN) extract global and local features from the raw sequences. We select the hand-designed features by comparing the predictive effect of different combinations of lncRNA and protein features. We also obtain the structure features and unify their dimensions through a Fourier transform. Finally, the four types of features are integrated to predict lncRNA–protein interactions. Compared with other state-of-the-art methods on three lncRNA–protein interaction datasets, LGFC-CNN achieves the best performance, with an accuracy of 94.14% on RPI21850, 92.94% on RPI7317 and 98.19% on RPI1847. The results show that LGFC-CNN can effectively predict lncRNA–protein interactions by combining raw sequence composition features, hand-designed features and structure features.
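
The abstract states that the structure features are brought to a common dimension through a Fourier transform, without giving the formula. As a rough illustration of the general idea only, the sketch below maps a numeric per-position profile of any length to a fixed number of values by keeping the magnitudes of the first `k` discrete Fourier coefficients. The function name `fourier_fixed_length`, the choice `k=10`, the use of magnitudes, and the scaling by length are all assumptions made for this example; the paper's exact transform may differ.

```python
import cmath
import math


def fourier_fixed_length(profile, k=10):
    """Map a variable-length numeric profile to exactly k values.

    Hypothetical helper: computes the first k discrete Fourier transform
    coefficients of `profile` and returns their magnitudes divided by the
    profile length. k=10 is an illustrative choice, not a value from the text.
    If k exceeds len(profile), the extra coefficients repeat periodically.
    """
    n = len(profile)
    if n == 0:
        raise ValueError("profile must not be empty")
    features = []
    for freq in range(k):
        # Plain O(n) sum per coefficient; fine for a sketch.
        coeff = sum(
            x * cmath.exp(-2j * math.pi * freq * t / n)
            for t, x in enumerate(profile)
        )
        features.append(abs(coeff) / n)
    return features


# Two profiles of different lengths end up with the same dimension.
short_profile = fourier_fixed_length([0.1, 0.9, 0.4], k=10)
long_profile = fourier_fixed_length([0.5] * 50, k=10)
```

Under these assumptions, profiles derived from an lncRNA and a protein of very different lengths both yield vectors of length `k`, which is what allows them to be fed to a model with a fixed input size.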

Highlights

  • Long noncoding RNA is a type of noncoding RNA with at least 200 nucleotides that plays vital roles in many critical biological processes [1], such as cell differentiation, gene expression, and the display of developmental and tissue-specific expression patterns [2,3,4]

  • Five significant stages are involved in the development of LGFC-CNN: (a) building the lncRNA–protein interaction datasets and obtaining the lncRNA and protein sequences; (b) feeding the lncRNA and protein hand-designed feature combinations into the random forest (RF) classifier to select the features with superior predictive effect; (c) preprocessing the lncRNA and protein sequences by two methods and one-hot encoding them; (d) obtaining the lncRNA's and protein's secondary structures, hydrogen bonding propensities, and van der Waals interactions, and unifying their dimensions by Fourier transform; and (e) feeding the global and local encoded sequences, hand-designed feature combinations and structural features into the CNN model to predict the lncRNA–protein interactions. An illustration of LGFC-CNN for predicting lncRNA–protein interactions is shown in Figure 1. Comparing LGFC-CNN with several existing methods, the results show that LGFC-CNN is a competitive method for effectively predicting lncRNA–protein interactions

  • Six features of the lncRNA were combined with ten features of the protein, and each feature combination was ranked according to its average performance in the random forest classifier
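
The feature-selection step described in the highlights (pairing each of six lncRNA features with each of ten protein features and ranking the pairs by average classifier performance) can be sketched as a simple loop. This is a minimal sketch, not the authors' code: the feature names are placeholders, `rank_feature_combinations` and `toy_score` are hypothetical, and `toy_score` is a deterministic stand-in for what would in practice be the accuracy of a random forest trained and evaluated on one data split.

```python
from itertools import product
from statistics import mean


def rank_feature_combinations(lncrna_features, protein_features, score_fn, n_repeats=5):
    """Rank every (lncRNA feature, protein feature) pair by mean score.

    `score_fn(rna_feat, prot_feat, seed)` is assumed to return one
    performance value (for example, accuracy on one split).
    """
    ranked = []
    for rna_feat, prot_feat in product(lncrna_features, protein_features):
        scores = [score_fn(rna_feat, prot_feat, seed) for seed in range(n_repeats)]
        ranked.append(((rna_feat, prot_feat), mean(scores)))
    # Best-performing combination first.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Placeholder names; the actual six and ten features are not listed here.
lncrna_features = [f"lncRNA_feature_{i}" for i in range(1, 7)]
protein_features = [f"protein_feature_{i}" for i in range(1, 11)]


def toy_score(rna_feat, prot_feat, seed):
    # Stand-in only: a real scorer would train a random forest on the
    # concatenated features and return its measured accuracy.
    rna_idx = int(rna_feat.rsplit("_", 1)[1])
    prot_idx = int(prot_feat.rsplit("_", 1)[1])
    return (rna_idx * 10 + prot_idx + seed) / 100


ranking = rank_feature_combinations(lncrna_features, protein_features, toy_score)
best_pair, best_mean = ranking[0]
```

Six lncRNA features and ten protein features give 60 combinations in total. Passing the scorer in as a function keeps the ranking logic independent of the classifier, so a real random forest evaluation could replace `toy_score` without changing the loop.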

Summary

Introduction

Long noncoding RNA (lncRNA) is a type of noncoding RNA with at least 200 nucleotides that plays vital roles in many critical biological processes [1], such as cell differentiation, gene expression, and the display of developmental and tissue-specific expression patterns [2,3,4]. Some lncRNAs participate in complex human diseases by interacting with proteins [7]. For example, silencing lncRNA5657 inhibits the lung inflammatory response in pneumonia by suppressing the expression of spinster homology protein, thereby reducing sepsis-induced lung injury [8]. LncRNA NEAT1 promotes MPTP-induced autophagy in Parkinson’s disease through the stabilization of the PINK1 protein [10]. Predicting potential lncRNA–protein interactions is therefore a crucial step toward understanding the function of lncRNA and creating the conditions for addressing complex human diseases. Alongside advances in experimental technology, computational methods have become crucial for capturing lncRNA–protein interactions at large scale, because they help to prioritize candidate interactions for further experimental verification.
