Abstract

Identification of protein–protein interactions (PPI) is among the critical problems in the domain of bioinformatics. Previous studies have utilized different AI-based models for PPI classification with advances in artificial intelligence (AI) techniques. The input to these models is the features extracted from different sources of protein information, mainly sequence-derived features. In this work, we present an AI-based PPI identification model utilizing a PPI network and protein sequences. The PPI network is represented as a graph where each node is a protein pair, and an edge is defined between two nodes if there exists a common protein between these nodes. Each node in a graph has a feature vector. In this work, we have used the language model to extract feature vectors directly from protein sequences. The feature vectors for protein in pairs are concatenated and used as a node feature vector of a PPI network graph. Finally, we have used the Graph-BERT model to encode the PPI network graph with sequence-based features and learn the hidden representation of the feature vector for each node. The next step involves feeding the learned representations of nodes to the fully connected layer, the output of which is fed into the softmax layer to classify the protein interactions. To assess the efficacy of the proposed PPI model, we have performed experiments on several PPI datasets. The experimental results demonstrate that the proposed approach surpasses the existing PPI works and designed baselines in classifying PPI.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.