Abstract
Self-interacting proteins (SIPs), in which two or more copies of the same protein interact with each other, play significant roles in understanding cellular processes and cell functions. Although a number of experimental methods have been designed to detect SIPs, they remain extremely time-consuming, expensive, and challenging. There is therefore an urgent need for computational methods that predict SIPs. In this study, we propose a deep-forest-based predictor for accurate prediction of SIPs from protein sequence information. More specifically, we introduce a novel feature representation method that integrates the position-specific scoring matrix (PSSM) with the wavelet transform. To evaluate the performance of the proposed method, cross-validation tests are performed on two widely used benchmark datasets. The experimental results show that the proposed model achieves high accuracies of 95.43% and 93.65% on the human and yeast datasets, respectively. The AUC values are also reported: 0.9203 on the yeast dataset and 0.9586 on the human dataset. To further show the advantage of the proposed method, it is compared with several existing methods. The results demonstrate that the proposed model outperforms the compared SIP prediction methods. This work can offer biologists an effective architecture for detecting new SIPs.
Highlights
Proteins, which are highly complex substances, are the main components of all life
Many computational techniques based on machine learning and deep learning (Gui et al., 2009; You et al., 2010b, 2015a, 2017a,b; Lu et al., 2013; Mi et al., 2013; Huang et al., 2015; Chen et al., 2016, 2018a,b,c; Gui et al., 2016; Huang et al., 2016b; Li et al., 2018) have been applied in bioinformatics and genomics, but few of them address the detection of protein interactions
In this study, we present a novel approach for predicting self-interacting proteins (SIPs) that combines a deep forest with a wavelet transform (WT) applied to the position-specific scoring matrix (PSSM) of protein sequences
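The idea of deriving wavelet-based features from a PSSM can be illustrated with a minimal sketch. It is not the authors' implementation: the Haar wavelet, the three decomposition levels, the mean and standard deviation summaries, and the names `haar_step` and `wavelet_features` are all assumptions chosen for illustration, and the PSSM here is random data rather than real profile output. The deep forest classifier that would consume these features is not shown.

```python
import math
import random
import statistics


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    if len(signal) % 2:
        # Assumption: pad odd-length signals by repeating the last value.
        signal = signal + [signal[-1]]
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_features(pssm, levels=3):
    """Map an L x 20 PSSM (list of rows) to a fixed-length feature vector.

    Each of the 20 columns is treated as a 1-D signal along the sequence.
    The mean and standard deviation of every detail band, plus those of the
    final approximation band, are kept (a hypothetical choice of summary).
    """
    features = []
    for column in zip(*pssm):
        approx = list(column)
        for _ in range(levels):
            approx, detail = haar_step(approx)
            features += [statistics.fmean(detail), statistics.pstdev(detail)]
        features += [statistics.fmean(approx), statistics.pstdev(approx)]
    return features


if __name__ == "__main__":
    random.seed(0)
    # Two made-up "proteins" of different lengths, each with 20 PSSM columns.
    short_pssm = [[random.uniform(-5, 5) for _ in range(20)] for _ in range(57)]
    long_pssm = [[random.uniform(-5, 5) for _ in range(20)] for _ in range(310)]
    print(len(wavelet_features(short_pssm)), len(wavelet_features(long_pssm)))
```

Under these assumptions every protein, whatever its length, maps to 20 × (2 × levels + 2) values, which is what allows sequences of different lengths to be fed to a single classifier.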
Summary
Proteins, which are highly complex substances, are the main components of all life. They are the material basis and the first element of life. Most proteins work together with molecular partners or other proteins through protein-protein interactions (PPIs) (Chou and Cai, 2006; You et al., 2014b,c; Li et al., 2017). SIPs play key roles in the understanding of cellular processes and cell functions. These interactions have received growing attention in recent years. Most previous work focuses on individual SIPs at the level of structure and function. Many computational techniques based on machine learning and deep learning (Gui et al., 2009; You et al., 2010b, 2015a, 2017a,b; Lu et al., 2013; Mi et al., 2013; Huang et al., 2015; Chen et al., 2016, 2018a,b,c; Gui et al., 2016; Huang et al., 2016b; Li et al., 2018) have been applied in bioinformatics and genomics, but few of them address the detection of protein interactions.