Abstract
Peptide-protein interactions between a smaller or disordered peptide stretch and a folded receptor make up a large part of all protein-protein interactions. A common approach for modeling such interactions is to exhaustively sample the conformational space by fast-Fourier-transform docking, and then refine a top percentage of decoys. Methods that can rank the decoys for selection fast enough for larger-scale studies commonly rely on first-principles energy terms, such as electrostatics and van der Waals forces, or on pre-calculated statistical potentials. We present InterPepRank for peptide-protein complex scoring and ranking. InterPepRank is a machine learning-based method that encodes the structure of the complex as a graph, with physical pairwise interactions as edges and evolutionary and sequence features as nodes. The graph network is trained to predict the LRMSD of decoys by using edge-conditioned graph convolutions on a large set of peptide-protein complex decoys. InterPepRank is tested on a massive independent test set in which no target shares CATH annotation or 30% sequence identity with any target in the training or validation data. On this set, InterPepRank has a median AUC of 0.86 for finding coarse peptide-protein complexes with LRMSD < 4 Å, an improvement over other state-of-the-art ranking methods, which have median AUCs between 0.65 and 0.79. When used to select decoys for refinement in a previously established peptide docking pipeline, InterPepRank increases the number of medium and high quality models produced by 80% and 40%, respectively. The InterPepRank program as well as all scripts for reproducing and retraining it are available from: http://wallnerlab.org/InterPepRank .
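The median AUC quoted above measures how well a scorer separates decoys with LRMSD < 4 Å from the rest, target by target. The sketch below shows one way such a per-target AUC and its median could be computed. It is an illustration and not the authors' evaluation script: the function names, the toy numbers, and the choice to negate a predicted LRMSD so that higher scores mean better decoys are all assumptions.

```python
from statistics import median


def ranking_auc(scores, lrmsd, cutoff=4.0):
    """ROC AUC for separating decoys with LRMSD < cutoff from the rest.

    scores: one value per decoy, higher means "believed to be better".
    lrmsd:  true ligand RMSD of each decoy in angstrom.
    Returns None when all decoys fall on one side of the cutoff.
    """
    pos = [s for s, d in zip(scores, lrmsd) if d < cutoff]
    neg = [s for s, d in zip(scores, lrmsd) if d >= cutoff]
    if not pos or not neg:
        return None
    # Pairwise (Mann-Whitney) form of the AUC: the fraction of
    # (positive, negative) pairs ranked correctly, ties counting half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def median_auc(per_target):
    """Median AUC over targets, skipping targets where AUC is undefined."""
    aucs = [ranking_auc(s, d) for s, d in per_target]
    return median(a for a in aucs if a is not None)


# Toy target (made-up numbers): true LRMSD and a predicted LRMSD per decoy.
true_lrmsd = [1.5, 3.0, 6.0, 12.0, 20.0]
predicted_lrmsd = [2.0, 7.0, 5.0, 15.0, 18.0]
# Assumption: a lower predicted LRMSD is better, so negate it to get a score.
toy_scores = [-p for p in predicted_lrmsd]
toy_auc = ranking_auc(toy_scores, true_lrmsd)
```

The quadratic pairwise loop keeps the sketch dependency-free; for decoy sets of realistic size a rank-based implementation would be the more practical choice.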
Highlights
Interactions between a short stretch of amino acid residues and a larger protein receptor, referred to as peptide-protein interactions, make up approximately 15–40% of all inter-protein interactions (Petsalaki and Russell, 2008), and are involved in regulating vital biological processes (Midic et al, 2009; Tu et al, 2015).
In this work we have developed InterPepRank, a machine learning-based method that encodes the structure of a peptide-protein complex as a graph, with physical pairwise interactions as edges and residue information, including evolutionary features such as PSSM and sequence conservation, as nodes.
The InterPepRank networks were averaged in an ensemble predictor, and the best ensemble used all networks except numbers 5 and 6.
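The graph encoding described in the highlights above can be pictured with a small sketch: residues become nodes carrying feature vectors, and residue pairs that are close in space become edges carrying interaction features. This is a simplified illustration under stated assumptions and not the InterPepRank implementation. The `Residue` class, the 6 Å distance cutoff, the chain labels, and the two edge features (distance and an inter-chain flag) are hypothetical stand-ins for the physical pairwise interactions and for the PSSM and conservation features the paper uses.

```python
import math
from dataclasses import dataclass


@dataclass
class Residue:
    chain: str       # hypothetical labels: "R" for receptor, "P" for peptide
    coord: tuple     # position of one representative atom, in angstrom
    features: list   # placeholder for e.g. a PSSM column plus conservation


def build_graph(residues, cutoff=6.0):
    """Encode a complex as (nodes, edges).

    nodes: list of per-residue feature vectors.
    edges: dict mapping (i, j) with i < j to (distance, inter_chain_flag).
    The cutoff is an assumed value chosen for illustration only.
    """
    nodes = [r.features for r in residues]
    edges = {}
    for i in range(len(residues)):
        for j in range(i + 1, len(residues)):
            d = math.dist(residues[i].coord, residues[j].coord)
            if d <= cutoff:
                inter = residues[i].chain != residues[j].chain
                edges[(i, j)] = (d, 1.0 if inter else 0.0)
    return nodes, edges


# Toy complex with made-up coordinates and one-element feature vectors.
toy = [
    Residue("R", (0.0, 0.0, 0.0), [0.1]),
    Residue("R", (3.0, 0.0, 0.0), [0.2]),
    Residue("P", (3.0, 4.0, 0.0), [0.3]),
    Residue("P", (30.0, 0.0, 0.0), [0.4]),  # too far away to get any edge
]
toy_nodes, toy_edges = build_graph(toy)
```

An edge-conditioned graph convolution would then use the edge tuples to modulate how neighboring node features are combined, which is why the edge features are stored alongside the connectivity rather than reduced to a plain adjacency list.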
Summary
Interactions between a short stretch of amino acid residues and a larger protein receptor, referred to as peptide-protein interactions, make up approximately 15–40% of all inter-protein interactions (Petsalaki and Russell, 2008), and are involved in regulating vital biological processes (Midic et al, 2009; Tu et al, 2015). These short peptides have a high degree of conformational freedom and can be part of larger disordered regions (Neduva et al, 2005; Petsalaki and Russell, 2008), making them difficult to study experimentally. Template-based methods that exploit similarity to previously experimentally determined complexes, such as SPOT-Peptide (Litfin et al, 2019), GalaxyPepDock (Lee et al, 2015), and InterPep (Johansson-Åkhe et al, 2020a), have consistently shown high performance in previous benchmarks but are limited by the availability of templates.