Abstract

Identifying protein-protein interaction (PPI) sites helps in interpreting protein functions and in developing new drugs. Traditional biological experiments for identifying PPI sites are expensive and inefficient, which has motivated a variety of computational methods for predicting them. However, accurate prediction of PPI sites remains challenging because of sample imbalance. In this work, we design a novel model that combines convolutional neural networks (CNNs) with Batch Normalization to predict PPI sites, and we employ the oversampling technique Borderline-SMOTE to address the sample imbalance. In particular, to better characterize the amino acid residues on the protein chains, we use a sliding window approach to extract features from each target residue together with its contextual residues. We verify the effectiveness of our method by comparing it with existing state-of-the-art schemes. On three public datasets, our method achieves accuracies of 88.6%, 89.9%, and 86.7%, respectively, each higher than those of the existing schemes. Moreover, the ablation experiment results suggest that Batch Normalization can greatly improve the generalization and prediction stability of our model.
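
The sliding window step mentioned above can be illustrated with a minimal sketch. It assumes each residue already has a fixed-length numeric feature vector and that positions beyond the ends of the chain are zero-padded; the function name `sliding_windows`, the `half_width` parameter, and the padding choice are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def sliding_windows(
    features: Sequence[Sequence[float]], half_width: int
) -> List[List[float]]:
    """Build one flattened context window per residue.

    `features[i]` is the feature vector of residue i (assumed input format).
    Each output row concatenates the vectors of residues
    i - half_width .. i + half_width, using zero vectors where the
    window extends past either end of the chain (assumed padding scheme).
    """
    if half_width < 0:
        raise ValueError("half_width must be non-negative")
    if not features:
        return []

    dim = len(features[0])
    pad = [0.0] * dim  # stand-in for positions outside the chain
    n = len(features)

    windows: List[List[float]] = []
    for i in range(n):
        row: List[float] = []
        for j in range(i - half_width, i + half_width + 1):
            row.extend(features[j] if 0 <= j < n else pad)
        windows.append(row)
    return windows


# Toy chain of three residues with one feature each.
toy_chain = [[1.0], [2.0], [3.0]]
toy_windows = sliding_windows(toy_chain, half_width=1)
```

Each row of `toy_windows` describes one target residue and its immediate neighbours, which is the kind of input a CNN-based classifier could consume; the window size actually used by the authors is not stated in the abstract.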
