Abstract
Background: Plant long non-coding RNAs (lncRNAs) play vital roles in many biological processes, mainly through interactions with RNA-binding proteins (RBPs). A fundamental step in understanding the function of lncRNAs is to identify which types of proteins interact with them. However, the models or rules governing these interactions remain a major challenge when computationally estimating the types of RBP involved.
Results: In this study, we propose PRPI-SC, an ensemble deep learning model that predicts plant lncRNA-protein interactions from sequence and structural information using a stacked denoising autoencoder and a convolutional neural network. PRPI-SC predicts interactions between lncRNAs and proteins based on the k-mer features of RNAs and proteins. Experiments showed good results on the Arabidopsis thaliana and Zea mays datasets (ATH948 and ZEA22133), with accuracies of 88.9% and 82.6%, respectively. PRPI-SC also performed well on some public RNA-protein interaction datasets.
Conclusions: PRPI-SC accurately predicts interactions between plant lncRNAs and proteins, which can guide studies of the function and expression of plant lncRNAs. PRPI-SC also generalizes well and gives good predictions on non-plant data.
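The k-mer features mentioned in the abstract can be illustrated with a short sketch. This is not the PRPI-SC implementation: the function name `kmer_frequencies`, the four-letter RNA alphabet, the choice of k = 2, and the example sequence are all assumptions made for illustration. The idea is only that a sequence is turned into a fixed-length vector of normalized k-mer counts, which a neural network can then take as input.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k, alphabet):
    """Return normalized k-mer frequencies in lexicographic order over `alphabet`.

    Hypothetical helper for illustration; k-mers containing symbols outside
    `alphabet` are ignored.
    """
    sequence = sequence.upper()
    # Count every overlapping window of length k.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Enumerate all possible k-mers so the vector length is always len(alphabet) ** k.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    total = sum(counts[km] for km in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[km] / total for km in kmers]


# Example with an assumed toy RNA sequence and k = 2 (16 possible 2-mers).
rna_features = kmer_frequencies("AUGGCUAUGC", k=2, alphabet="ACGU")
```

The same function could be applied to protein sequences by passing a protein alphabet; whether to use the full 20-letter amino-acid alphabet or a reduced one is a design choice not specified in this section.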
Highlights
Plant long non-coding RNAs play vital roles in many biological processes, mainly through interactions with RNA-binding proteins (RBPs)
We propose PRPI-SC, a sequence- and structure-based ensemble model that predicts plant long non-coding RNA (lncRNA)-protein interactions using a stacked denoising autoencoder (SDAE) and a convolutional neural network (CNN)
The SDAE has strong noise-reduction capability and can effectively eliminate interference from noisy data, which is more common in plant datasets
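As a rough illustration of the denoising idea behind an SDAE: a denoising autoencoder is trained to reconstruct the clean input from a deliberately corrupted copy, which is what makes the learned representation robust to noise. The sketch below shows only the corruption step, using masking noise (randomly zeroing entries). The function name `mask_corrupt`, the corruption rate, and the choice of masking noise are assumptions; this section does not state which noise type PRPI-SC uses.

```python
import random


def mask_corrupt(vector, corruption_rate, rng):
    """Zero out each entry independently with probability `corruption_rate`.

    Hypothetical helper: masking noise is one common corruption scheme for
    denoising autoencoders, assumed here for illustration.
    """
    return [0.0 if rng.random() < corruption_rate else v for v in vector]


rng = random.Random(0)          # fixed seed so the example is repeatable
clean = [0.125] * 8             # stand-in for a k-mer frequency vector
noisy = mask_corrupt(clean, corruption_rate=0.3, rng=rng)
# During training, the autoencoder would receive `noisy` as input
# and be penalized for failing to reconstruct `clean`.
```

In a stacked arrangement, each layer would be trained this way on the output of the previous layer; the training loop itself is omitted here.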
Summary
Plant long non-coding RNAs (lncRNAs) play vital roles in many biological processes, mainly through interactions with RNA-binding proteins (RBPs). LncRNAs are abundant non-protein-coding transcripts more than 200 nt in length, and they are found widely in the nucleus and cytoplasm. Researchers have found that lncRNAs are involved in regulating multiple crucial biological processes by interacting with proteins such as chromatin-modifying complexes and transcription factors [2,3,4]. Many key cellular processes, such as signal transduction, chromosome replication, material transport, mitosis, transcription, and translation, are all linked to the interactions between lncRNAs and proteins [9,10,11]. Since the regulatory function of lncRNAs requires the coordination of protein molecules, it is necessary to identify the interactions between lncRNAs and protein molecules.
Zhou et al. BMC Bioinformatics (2021) 22:415