Abstract

Prediction of protein-protein interactions (PPIs) remains a central task in systems biology. With more PPIs identified, forming PPI networks, it has become feasible and also imperative to study PPIs at the network level, for example through evolutionary analysis of the networks, both to better understand PPI networks and to predict pairwise PPIs more accurately by leveraging the information gained at the network level. In this work we developed a novel method that incorporates evolutionary information into geometric space to improve PPI prediction, which in turn can be used to select and evaluate various evolutionary models. The method is tested with cross-validation on human and yeast PPI network data. The results show that the accuracy of PPI prediction measured by ROC score is increased by up to 14.6%, as compared to a baseline without using evolutionary information. The results also indicate that our modified evolutionary model DANEOsf, which combines a gene duplication/neofunctionalization model and a scale-free model, has better fitness and prediction efficacy for these two PPI networks. The improved PPI prediction performance may suggest that our DANEOsf evolutionary model can uncover the underlying evolutionary mechanism for these two PPI networks better than the other tested models. Of particular importance, our method therefore offers an effective way to select the evolutionary models that best capture the underlying evolutionary mechanisms, by evaluating the fitness of evolutionary models from the perspective of PPI prediction on real PPI networks.

Highlights

  • With continuous efforts in identifying protein-protein interactions (PPIs) through both high-throughput wet-lab experiments and computational methods, an increasing number of new PPIs have been discovered and validated, enabling sizeable PPI networks to be formed

  • As mentioned in the method section, our method consists of two prediction stages: elementary prediction based on evolutionary analysis and final prediction based on Euclidean distance

  • To make a comprehensive comparison, we apply our evolutionary distance based embedding (EDE) algorithm to the three evolutionary distance matrices (DANEOsf, Linear Preference Attachment (LPA) and Random Mutation model (RM)) to compare their prediction efficacy; we use the minimum spanning tree based shortest path matrix (SP) as input to the minimum curvilinear embedding algorithms MCE-MDS, MCE-SVD and ncMCE-SVD (the Multidimensional Scaling, Singular Value Decomposition and non-centered Singular Value Decomposition versions of Minimum Curvilinearity Embedding, MCE) proposed by Cannistraci et al. [12, 17], and the embedding method proposed by Kuchaiev et al. [16]
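
The final prediction stage and the ROC-based evaluation described above can be illustrated with a minimal sketch. It assumes the proteins have already been embedded into a geometric space by some earlier step, which is not shown here. The function names, the toy coordinates and the protein labels are hypothetical and not taken from the paper. Candidate pairs are scored by Euclidean distance, with smaller distances read as more likely interactions, and the ROC score is computed as the fraction of (held-out interaction, non-interaction) pairs that are ranked correctly.

```python
import math
from itertools import combinations


def euclidean_scores(coords, known_edges):
    """Return {(u, v): distance} for every pair not already a known edge.

    coords maps a protein name to its embedded coordinates (assumed given).
    A smaller distance is interpreted as a more likely interaction.
    """
    known = {frozenset(edge) for edge in known_edges}
    scores = {}
    for u, v in combinations(sorted(coords), 2):
        if frozenset((u, v)) in known:
            continue  # training edges are not candidates for prediction
        scores[(u, v)] = math.dist(coords[u], coords[v])
    return scores


def roc_auc(scores, positives):
    """ROC score: probability that a held-out true pair is closer than a
    non-interacting pair, with ties counted as one half."""
    pos = {frozenset(pair) for pair in positives}
    pos_d = [d for pair, d in scores.items() if frozenset(pair) in pos]
    neg_d = [d for pair, d in scores.items() if frozenset(pair) not in pos]
    if not pos_d or not neg_d:
        raise ValueError("need at least one positive and one negative pair")
    wins = sum(
        1.0 if p < n else 0.5 if p == n else 0.0
        for p in pos_d
        for n in neg_d
    )
    return wins / (len(pos_d) * len(neg_d))


# Toy data (hypothetical): four proteins embedded in two dimensions.
coords = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (0.0, 1.0), "D": (5.0, 5.0)}
training_edges = [("A", "B")]   # interactions visible during training
held_out_edges = [("A", "C")]   # interactions hidden for cross-validation

scores = euclidean_scores(coords, training_edges)
auc = roc_auc(scores, held_out_edges)
print(f"ROC score on the toy network: {auc:.2f}")
```

The pairwise comparison is quadratic in the number of candidate pairs, which is acceptable for a toy example; for genome-scale networks a rank-based computation would be a more practical choice.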

Summary

Introduction

With continuous efforts in identifying protein-protein interactions (PPIs) through both high-throughput wet-lab experiments and computational methods, an increasing number of new PPIs have been discovered and validated, enabling sizeable (even genome-wide) PPI networks to be formed. To reveal the underlying evolutionary mechanisms of PPI networks, many evolutionary models, such as the Duplication-Divergence model [2,3,4,5] and the Scale Free model [6], have been proposed to simulate the evolutionary processes of PPI networks. Which of these models best fits the networks of different species remains controversial [7,8,9]. Evolutionary models are mainly used to explain how networks evolved from an ancient version to what they currently are by going back in time, namely “removing” edges from the current networks. It would be useful, probably even more so, to use them to “add” edges, i.e., to predict de novo interactions.
