Abstract
Prediction of protein-protein interactions (PPIs) remains a central task in systems biology. As more PPIs are identified and assembled into PPI networks, it has become both feasible and imperative to study PPIs at the network level, for example through evolutionary analysis of the networks, both to understand PPI networks better and to predict pairwise PPIs more accurately by leveraging network-level information. In this work we developed a novel method that incorporates evolutionary information into geometric space to improve PPI prediction, which in turn can be used to select and evaluate evolutionary models. The method is tested with cross-validation on human and yeast PPI network data. The results show that the accuracy of PPI prediction, measured by ROC score, increases by up to 14.6% compared to a baseline that does not use evolutionary information. The results also indicate that our modified evolutionary model DANEOsf, which combines a gene duplication/neofunctionalization model and a scale-free model, has better fitness and prediction efficacy for these two PPI networks. The improved prediction performance suggests that DANEOsf may capture the underlying evolutionary mechanism of these two PPI networks better than the other tested models. Of particular importance, our method offers an effective way to select the evolutionary models that best capture the underlying evolutionary mechanisms, by evaluating the fitness of evolutionary models from the perspective of PPI prediction on real PPI networks.
Highlights
With continuous efforts in identifying protein-protein interactions (PPIs) through both high-throughput wet-lab experiments and computational methods, an increasing number of new PPIs have been discovered and validated, enabling sizeable PPI networks to be formed.
As mentioned in the method section, our method consists of two prediction stages: elementary prediction based on evolutionary analysis and final prediction based on Euclidean distance.
To make a comprehensive comparison, we apply our evolutionary distance based embedding (EDE) algorithm to the three evolutionary distance matrices (DANEOsf, Linear Preference Attachment (LPA) and Random Mutation model (RM)) to compare their prediction efficacy. We use the minimum spanning tree based shortest path matrix (SP) as an input to the Minimum Curvilinearity Embedding (MCE) algorithms proposed by Cannistraci et al. [12, 17], in their MCE-MDS (Multidimensional Scaling), MCE-SVD (Singular Value Decomposition) and ncMCE-SVD (non-centered MCE with SVD) versions, and the embedding method proposed by Kuchaiev et al. [16].
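The general pattern behind these embedding-based predictors (a pairwise distance matrix is embedded into a geometric space, and candidate pairs are then ranked by Euclidean distance) can be illustrated with a minimal sketch. This is not the authors' EDE algorithm or any of the MCE variants: it uses plain classical MDS, and hop-count shortest paths on a toy graph stand in for the evolutionary distance matrices, which this excerpt does not define. All function names below are hypothetical.

```python
import numpy as np


def shortest_path_matrix(weights):
    """All-pairs shortest paths (Floyd-Warshall) on a dense weight matrix.

    Missing edges are marked with np.inf. This is an illustrative stand-in
    for a distance matrix; it is not the evolutionary distance of the paper.
    """
    d = np.array(weights, dtype=float)
    np.fill_diagonal(d, 0.0)
    for k in range(d.shape[0]):
        # Relax every pair (i, j) through intermediate node k.
        d = np.minimum(d, d[:, [k]] + d[[k], :])
    return d


def classical_mds(dist, dim=2):
    """Embed a symmetric distance matrix into `dim` dimensions (classical MDS)."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering  # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(gram)
    order = np.argsort(vals)[::-1][:dim]               # largest eigenvalues first
    lam = np.clip(vals[order], 0.0, None)              # drop tiny negative noise
    return vecs[:, order] * np.sqrt(lam)


def pair_scores(coords, pairs):
    """Score candidate pairs: a smaller Euclidean distance gives a higher score."""
    return np.array([-np.linalg.norm(coords[i] - coords[j]) for i, j in pairs])


def roc_auc(scores, labels):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos, neg = scores[labels == 1], scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy 4-node path graph 0-1-2-3 with unit edge weights (assumed data).
    adj = np.full((4, 4), np.inf)
    for i in range(3):
        adj[i, i + 1] = adj[i + 1, i] = 1.0

    dist = shortest_path_matrix(adj)
    coords = classical_mds(dist, dim=2)

    candidates = [(0, 1), (1, 2), (0, 3), (0, 2)]
    labels = [1, 1, 0, 0]  # 1 = true interaction in this toy example
    print(roc_auc(pair_scores(coords, candidates), labels))
```

In a cross-validation setting, a fraction of the known edges would be hidden before the distance matrix is built, and the ROC score would then be computed on the hidden edges against sampled non-edges; the exact protocol used in the paper is not given in this excerpt.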
Summary
With continuous efforts in identifying protein-protein interactions (PPIs) through both high-throughput wet-lab experiments and computational methods, an increasing number of new PPIs have been discovered and validated, enabling sizeable (even genome-wide) PPI networks to be formed. To reveal the underlying evolutionary mechanisms of PPI networks, many evolutionary models, such as the Duplication-Divergence model [2,3,4,5] and the Scale Free model [6], have been proposed to simulate the evolutionary processes of PPI networks. Which of these models best fits the networks of different species remains controversial [7,8,9]. Evolutionary models are mainly used to explain how networks evolved from an ancient version to their current form by going back in time, namely by “removing” edges from the current networks. It would be useful, probably even more so, to also “add” edges, i.e., to predict de novo interactions.