Abstract

Predicting protein-protein interactions has become a key step in reverse-engineering biological networks to better understand cellular functions. Experimental methods for determining protein-protein interactions are time-consuming and costly, which has motivated vigorous development of computational approaches for predicting them. A set of recently developed bioinformatics methods uses coevolutionary information about the interacting partners, for example in the form of correlations between distance matrices, where, for each protein, a matrix stores the pairwise distances between the protein and its orthologs in a group of reference genomes. We propose a novel method that accounts for intra-matrix correlations to improve predictive accuracy. The distance matrices for a pair of proteins are transformed and concatenated into a phylogenetic vector. A least-squares support vector machine is trained and tested on pairs of proteins, represented as phylogenetic vectors, whose interactions are known. The intra-matrix correlations are accounted for by introducing a weighted linear kernel, which defines the dot product of two phylogenetic vectors. In cross-validation experiments, our method achieves a receiver operating characteristic (ROC) score of 0.928, a significant improvement over the 0.659 obtained with Pearson correlations.
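
To make the pipeline in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that "transforming" a distance matrix means flattening its strict upper triangle, and that the kernel weights are supplied by the caller; the abstract specifies neither, so `upper_triangle`, `phylogenetic_vector`, `weighted_linear_kernel`, and the toy matrices are all hypothetical names and data chosen for illustration. The `pearson` function shows the baseline idea of correlating two distance matrices directly.

```python
from math import sqrt


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square, symmetric distance matrix.

    Assumption: this stands in for the unspecified 'transform' step.
    """
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def phylogenetic_vector(dist_a, dist_b):
    """Concatenate the flattened distance matrices of a protein pair."""
    return upper_triangle(dist_a) + upper_triangle(dist_b)


def weighted_linear_kernel(x, y, weights):
    """Weighted dot product: sum over k of weights[k] * x[k] * y[k].

    Assumption: per-component weights; how they would be derived from
    intra-matrix correlations is not stated in the abstract.
    """
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y, and weights must have equal length")
    return sum(w * a * b for w, a, b in zip(weights, x, y))


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (baseline score)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Toy distance matrices over three reference genomes (made-up values).
dist_a = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
dist_b = [[0, 2, 4], [2, 0, 6], [4, 6, 0]]

vec = phylogenetic_vector(dist_a, dist_b)
uniform = [1.0] * len(vec)  # uniform weights reduce to the plain linear kernel

print(vec)
print(weighted_linear_kernel(vec, vec, uniform))
print(pearson(upper_triangle(dist_a), upper_triangle(dist_b)))
```

With uniform weights the kernel is the ordinary dot product; non-uniform weights let some components of the phylogenetic vector contribute more than others. A kernel of this form could then be passed to any support vector machine implementation that accepts a user-defined or precomputed kernel.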
