Abstract

Protein-protein interactions (PPIs) play a central role in most cellular processes. Although a great deal of research has been devoted to detecting PPIs with high-throughput experimental technologies, these methods are expensive and cumbersome. As an alternative to traditional experimental methods, computational methods have attracted much attention because of their good performance in detecting PPIs. In this work, we propose a novel computational method, named PCVM-LM, which combines the probabilistic classification vector machine (PCVM) model with Legendre moments (LMs) to predict PPIs from amino acid sequences. The improvement mainly comes from using LMs to extract discriminatory information embedded in the position-specific scoring matrix (PSSM) and using the PCVM classifier to perform the prediction. The proposed method was evaluated on the Yeast and Helicobacter pylori datasets with five-fold cross-validation. It achieves high average accuracies of 96.37% and 93.48%, respectively, which are higher than those of other well-known methods. To further evaluate the method, we compared it with the state-of-the-art support vector machine (SVM) classifier and other existing methods on the same datasets. The comparison results show that our method outperforms the SVM-based method and the other existing methods. These promising results demonstrate the reliability and effectiveness of the proposed method, which can serve as a useful decision support tool for protein research.
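
To make the feature extraction step in the abstract more concrete, the following is a minimal sketch of how two-dimensional Legendre moments could be computed from a PSSM-like matrix. It is an illustration under stated assumptions, not the authors' implementation: the function name `legendre_moments`, the mapping of matrix cells to midpoints on [-1, 1], and the normalization factor (2p+1)(2q+1)/(MN) are assumptions, and the paper may use a different discretization or moment order.

```python
import numpy as np
from numpy.polynomial.legendre import legvander


def legendre_moments(matrix, max_order):
    """Approximate 2D Legendre moments L[p, q] for p, q = 0..max_order.

    Assumption: each cell of the input is treated as a sample at the
    midpoint of an equal-width bin on [-1, 1] along both axes.
    """
    f = np.asarray(matrix, dtype=float)
    rows, cols = f.shape

    # Map row and column indices to bin midpoints in [-1, 1].
    x = (2.0 * np.arange(rows) + 1.0) / rows - 1.0
    y = (2.0 * np.arange(cols) + 1.0) / cols - 1.0

    # Column k of each Vandermonde matrix holds P_k at the sample points.
    px = legvander(x, max_order)  # shape (rows, max_order + 1)
    py = legvander(y, max_order)  # shape (cols, max_order + 1)

    # Normalization (2p+1)(2q+1)/(rows*cols) from the midpoint-rule
    # approximation of the continuous moment integral.
    orders = 2.0 * np.arange(max_order + 1) + 1.0
    norm = np.outer(orders, orders) / (rows * cols)

    # Sum over all cells of P_p(x_i) * P_q(y_j) * f[i, j], for every (p, q).
    return norm * (px.T @ f @ py)


# Hypothetical stand-in for a PSSM: one row per residue, 20 columns.
rng = np.random.default_rng(0)
toy_pssm = rng.normal(size=(120, 20))
features = legendre_moments(toy_pssm, max_order=5).ravel()
```

Flattening the moment matrix gives a fixed-length vector regardless of protein length, which is the property a downstream classifier such as PCVM or SVM needs. How the vectors of the two proteins in a pair are combined is not specified in this section, so it is left out of the sketch.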

Highlights

  • Proteins are essential components of organisms and are involved in almost all cellular activities

  • To comprehensively evaluate the proposed method, we compared the probabilistic classification vector machine (PCVM) classifier with the state-of-the-art support vector machine (SVM) classifier and other existing methods on the same dataset

  • Certain training points that belong to the negative class may be assigned positive weights, and vice versa. To overcome this problem, the PCVM classifier was proposed, which gives different prior weights to different classes

Summary

Introduction

Proteins are essential components of organisms and are involved in almost all cellular activities. A number of computational methods have been proposed to detect PPIs based on different data types, such as phylogenetic profiles [6,7], literature mining knowledge [8], gene neighborhood [9,10], gene fusion [11,12], and sequence conservation [13,14]. These methods cannot work if prior knowledge about the proteins is not available. A novel sequence-based computational method was therefore proposed for identifying PPIs that combines the probabilistic classification vector machine (PCVM) model with a novel protein sequence feature extraction scheme. The comparison results show that our method outperforms SVM and other previous methods.

Golden Standard Datasets
Position-Specific Scoring Matrix
Legendre Moments
Related
PCVMtoprovides methods:
PCVM Algorithm
Initial Parameter Selection and Training
Performance Evaluation
Assessment of Prediction
Comparison of the Proposed Method with the SVM-Based Approach
Performance on Independent Dataset
Comparison with Other Methods
Proposed Method
Conclusions