Abstract

Protein-protein interactions (PPIs) play a vital role in many biological processes, such as signal transduction, transcriptional regulation, and apoptosis, so identifying them is important. Although advances in high-throughput technologies have generated large amounts of PPI data for different species, these data cover only a small part of the entire PPI network. Furthermore, traditional experimental methods are generally expensive, time-consuming, tedious, and prone to high false-positive rates. To overcome these limitations, computational methods for predicting PPIs are needed. In this article, we propose an efficient computational method that detects PPIs using only protein sequence information by integrating the MatPCA feature extraction algorithm with a weighted sparse representation classifier. On the yeast, human, and H. pylori datasets, the proposed method achieves average accuracies of 94.55%, 97.48%, and 83.64%, respectively. These experimental results indicate that the proposed method is reliable and robust in predicting PPIs and can serve as a useful complement to experimental methods.

Highlights

  • Proteins are an important part of all organisms and are one of the most versatile organic macromolecules in living systems

  • It is essential to employ computational methods to predict the interactions between protein pairs as this is important for explaining the molecular basis of complex cellular processes

  • A novel computational model using solely protein sequence information was proposed for protein-protein interaction prediction. The proposed model first transforms the original protein sequence into a substitution matrix representation
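
The sequence-to-matrix step named in the last highlight can be illustrated with a minimal sketch. It assumes the common construction in which row i of the representation holds the substitution scores of the i-th residue against every letter of the alphabet; the source does not specify which substitution matrix is used. The names `TOY_ALPHABET`, `TOY_SCORES`, and `substitution_matrix_representation` are hypothetical, and the scores are placeholder values over a three-letter alphabet, not a real 20 x 20 substitution matrix.

```python
from typing import Dict, List, Sequence

# Placeholder scores over a three-letter alphabet (assumption for illustration;
# a real model would use a full 20x20 amino acid substitution matrix).
TOY_ALPHABET: List[str] = ["A", "R", "N"]
TOY_SCORES: Dict[str, Dict[str, int]] = {
    "A": {"A": 4, "R": -1, "N": -2},
    "R": {"A": -1, "R": 5, "N": 0},
    "N": {"A": -2, "R": 0, "N": 6},
}


def substitution_matrix_representation(
    sequence: str,
    scores: Dict[str, Dict[str, int]],
    alphabet: Sequence[str],
) -> List[List[int]]:
    """Map a sequence of length N to an N x len(alphabet) score matrix.

    Row i is the substitution-score row of the i-th residue, with columns
    ordered as in `alphabet`.
    """
    matrix: List[List[int]] = []
    for position, residue in enumerate(sequence.upper()):
        if residue not in scores:
            raise ValueError(f"unknown residue {residue!r} at position {position}")
        matrix.append([scores[residue][column] for column in alphabet])
    return matrix


# One row per residue of the toy sequence, one column per alphabet letter.
smr = substitution_matrix_representation("ARNA", TOY_SCORES, TOY_ALPHABET)
for row in smr:
    print(row)
```

Because the number of rows equals the sequence length, proteins of different lengths yield matrices of different shapes; a subsequent feature extraction step (MatPCA in the proposed method) is what would turn such matrices into fixed-length descriptors.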

Summary

Introduction

Proteins are an important part of all organisms and are among the most versatile organic macromolecules in living systems. Although traditional experimental methods have achieved some results in detecting PPIs, the interactions they have identified account for only a small part of the entire PPI network. These methods are also costly and time-consuming, generalize poorly, and suffer from high false-negative and false-positive rates [6, 7]. With the development of genomic technologies, protein sequence data have grown explosively and are readily available. Researchers have therefore developed many methods that infer potential PPIs from the amino acid sequences of proteins. Experimental results confirm that predicting PPIs using only protein amino acid sequences is feasible [15, 16, 17].
