Abstract

Protein-Protein Interactions (PPIs) play essential roles in most cellular processes. Knowledge of PPIs is becoming increasingly important, which has prompted the development of technologies capable of discovering PPIs on a large scale. Although many high-throughput biological technologies have been proposed to detect PPIs, they have unavoidable shortcomings, including cost, time intensity, and inherently high false positive and false negative rates. For these reasons, in silico methods are attracting much attention because of their good performance in predicting PPIs. In this paper, we propose a novel computational method, RVM-AB, that combines the Relevance Vector Machine (RVM) model and Average Blocks (AB) to predict PPIs from protein sequences. The main improvements come from representing protein sequences using the AB feature representation on a Position Specific Scoring Matrix (PSSM), reducing the influence of noise using Principal Component Analysis (PCA), and using an RVM-based classifier. We performed five-fold cross-validation experiments on yeast and Helicobacter pylori datasets and achieved very high accuracies of 92.98% and 95.58%, respectively, which are significantly better than those of previous works. In addition, we obtained good prediction accuracies of 88.31%, 89.46%, 91.08%, 91.55%, and 94.81% on five other independent datasets (C. elegans, M. musculus, H. sapiens, H. pylori, and E. coli, respectively) for cross-species prediction. To further evaluate the proposed method, we compared it with the state-of-the-art support vector machine (SVM) classifier on the yeast dataset. The experimental results demonstrate that our RVM-AB method clearly outperforms the SVM-based method. These promising results show the efficiency and simplicity of the proposed method, which can serve as an automatic decision support tool.
To facilitate future proteomics research, we developed a freely available web server called RVMAB-PPI in Hypertext Preprocessor (PHP) for predicting PPIs. The web server, including the source code and the datasets, is available at http://219.219.62.123:8888/ppi_ab/.
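
The five-fold cross-validation protocol mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and because no RVM implementation is assumed to be available here, a simple nearest-centroid classifier is swapped in as a stand-in. It only shows how accuracy is averaged over five held-out folds.

```python
import numpy as np

def five_fold_indices(n_samples, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs that partition the samples into folds."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    folds = np.array_split(order, n_folds)
    for i in range(n_folds):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        yield train_idx, test_idx

def nearest_centroid_predict(X_train, y_train, X_test):
    """Stand-in classifier (NOT an RVM): assign each sample to the closest class mean."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    # Squared Euclidean distance from every test sample to every class centroid.
    dists = ((X_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dists.argmin(axis=1)]

def cross_validated_accuracy(X, y, n_folds=5, seed=0):
    """Mean accuracy over the held-out folds."""
    accuracies = []
    for train_idx, test_idx in five_fold_indices(len(y), n_folds, seed):
        predicted = nearest_centroid_predict(X[train_idx], y[train_idx], X[test_idx])
        accuracies.append(np.mean(predicted == y[test_idx]))
    return float(np.mean(accuracies))
```

In practice `X` would hold one feature vector per protein pair and `y` the interacting or non-interacting labels; replacing `nearest_centroid_predict` with an actual RVM would be needed to approximate the method described above.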

Highlights

  • Proteins are fundamental molecules of living organisms that participate in most cell functions

  • Tables 1 and 2 list the experimental results of the prediction models on the yeast and Helicobacter pylori datasets; the models combine the Relevance Vector Machine (RVM) classifier with Average Blocks, the Position Specific Scoring Matrix, and principal component analysis, using protein sequence information

  • The Average Blocks method rests on two observations: residue conservation tendencies are similar within a domain family, and the locations of domains in the same family are closely related to the length of the sequence. Information can therefore be captured effectively from the Position Specific Scoring Matrix (PSSM) using the Average Blocks method; while maintaining the integrity of the information in the PSSM, we reduced the dimensions of each AB vector and the influence of noise using principal component analysis
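
As a rough illustration of the Average Blocks idea in the highlight above, the sketch below splits a PSSM into contiguous blocks along the sequence, averages each block column by column, and then projects a set of such vectors onto their leading principal components. The names `average_blocks` and `pca_reduce`, the choice of 20 blocks, and the number of retained components are assumptions made for illustration; the text here does not specify them, and this is not the authors' implementation.

```python
import numpy as np

def average_blocks(pssm, n_blocks=20):
    """Turn an (L x 20) PSSM into a fixed-length vector of block-wise column means.

    n_blocks=20 is an assumed setting, giving a 20 * 20 = 400 dimensional vector.
    """
    pssm = np.asarray(pssm, dtype=float)
    if pssm.ndim != 2:
        raise ValueError("pssm must be a 2-D array of shape (length, n_columns)")
    if pssm.shape[0] < n_blocks:
        raise ValueError("sequence is shorter than the number of blocks")
    # Split the rows (sequence positions) into n_blocks nearly equal contiguous chunks,
    # so each block covers the same fraction of the sequence regardless of its length.
    blocks = np.array_split(pssm, n_blocks, axis=0)
    return np.concatenate([block.mean(axis=0) for block in blocks])

def pca_reduce(X, n_components):
    """Project the rows of X onto the top n_components principal directions."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical PSSMs for proteins of different lengths (random values, not real data).
    pssms = [rng.normal(size=(length, 20)) for length in (57, 120, 333, 80)]
    features = np.stack([average_blocks(p) for p in pssms])
    print(features.shape)                 # (4, 400)
    print(pca_reduce(features, 2).shape)  # (4, 2)
```

Because every protein maps to a vector of the same length, sequences of different lengths become directly comparable, which is what allows a fixed-input classifier to be trained on them.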

Summary

Introduction

Proteins are fundamental molecules of living organisms that participate in most cell functions. Protein-protein interactions (PPIs) play an essential role in many biological processes, and detecting these interactions is becoming increasingly important. A number of high-throughput technologies, such as immunoprecipitation [1], protein chips [2], and yeast two-hybrid screening methods [3,4], have been developed for detecting PPIs on a large scale. However, these experimental approaches have disadvantages, such as time-intensiveness and high cost. They also suffer from high rates of false positives and false negatives. For these reasons, predicting unknown PPIs using only biological experimental methods is considered a difficult task. There is a stronger motivation to exploit computational methods for PPIs
