Protein–protein interaction (PPI) prediction is one of the main goals of current proteomics. This work presents a method for predicting PPIs with a classification technique known as support vector machines. The dataset is a set of positive and negative examples taken from a high-reliability source. From this dataset we extracted 26 proteomic/genomic features using well-known databases and datasets, and propose a similarity measure. Feature selection was performed to obtain the most relevant variables, using a modified method derived from other feature selection methods for classification. Using the selected subset of features, we constructed a support vector classifier that achieves specificity and sensitivity above 90% in PPI prediction and also provides a confidence score for the predicted interaction of each pair of proteins.
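For readers less familiar with the two reported metrics, the following is a minimal sketch of how sensitivity and specificity are computed from binary predictions. It is not the authors' evaluation code: the function name `sensitivity_specificity` and the toy labels are assumptions introduced purely for illustration, with 1 standing for an interacting pair and 0 for a non-interacting pair.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = interacting pair)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")

    sensitivity = tp / (tp + fn)  # fraction of true interactions recovered
    specificity = tn / (tn + fp)  # fraction of non-interactions correctly rejected
    return sensitivity, specificity


# Toy labels invented for illustration; they are NOT results from this work.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Reporting both values matters for PPI data because a classifier can score well on one while failing on the other, for example by labelling every pair as non-interacting.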