Abstract

Predicting protein structures from amino acid sequences is of great importance. Recently, the accuracy of de novo protein structure prediction has improved substantially when assisted by information about contacts between residues, which is itself predictable from the sequence. Here, we present a novel pipeline for rapid protein structure prediction that consists of a residue contact predictor, AmoebaContact, and a contact-assisted folder, GDFold. Unlike mainstream contact predictors that utilize simple, regularized neural networks, AmoebaContact adopts a set of network architectures that are optimized for contact prediction through automatic searching, and it predicts contacts at a series of cutoffs. Unlike conventional contact-assisted folders that use only the top-scored contact pairs, GDFold considers all residue pairs from the AmoebaContact predictions in a differentiable loss function and optimizes atom coordinates using the gradient descent algorithm. The combination of AmoebaContact and GDFold allows quick modelling of the protein structure with acceptable model quality.

Predicting the structure of proteins from amino acid sequences is a hard problem. Convolutional neural networks can learn to predict a map of distances between amino acid residues that can be turned into a three-dimensional structure. With a combination of approaches, including an evolutionary technique to find the best neural network architecture and a tool to find the atom coordinates in the folded structure, a pipeline for rapid prediction of three-dimensional protein structures is demonstrated.
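The idea of scoring every residue pair in a differentiable loss and then moving coordinates by gradient descent can be illustrated with a toy sketch. This is not the GDFold implementation: the loss form, the single 8.0 cutoff (the abstract mentions a series of cutoffs), the one-point-per-residue model, the learning rate, and all function names below are assumptions chosen only for illustration.

```python
import math
import random


def pair_loss_and_grad(d, p, cutoff):
    """Loss and d(loss)/d(distance) for one residue pair.

    Hypothetical loss form: a pair predicted to be in contact with
    probability p is penalized for being farther apart than the cutoff,
    and penalized with weight (1 - p) for being closer than the cutoff.
    """
    if d > cutoff:
        return p * (d - cutoff) ** 2, 2.0 * p * (d - cutoff)
    return (1.0 - p) * (cutoff - d) ** 2, -2.0 * (1.0 - p) * (cutoff - d)


def total_loss_and_grad(coords, probs, cutoff):
    """Sum the pair loss over ALL residue pairs, not only the top-scored ones."""
    n = len(coords)
    loss = 0.0
    grad = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            diff = [coords[i][k] - coords[j][k] for k in range(3)]
            d = math.sqrt(sum(c * c for c in diff)) + 1e-9  # avoid division by zero
            pair_loss, dloss_dd = pair_loss_and_grad(d, probs[i][j], cutoff)
            loss += pair_loss
            for k in range(3):
                g = dloss_dd * diff[k] / d  # chain rule through the distance
                grad[i][k] += g
                grad[j][k] -= g
    return loss, grad


def fold(probs, cutoff=8.0, steps=500, lr=0.01, seed=0):
    """Plain gradient descent on 3D coordinates, starting from random positions."""
    rng = random.Random(seed)
    n = len(probs)
    coords = [[rng.uniform(-10.0, 10.0) for _ in range(3)] for _ in range(n)]
    history = []
    for _ in range(steps):
        loss, grad = total_loss_and_grad(coords, probs, cutoff)
        history.append(loss)
        for i in range(n):
            for k in range(3):
                coords[i][k] -= lr * grad[i][k]
    return coords, history
```

For example, with a made-up contact map for a six-residue chain in which nearby residues are likely contacts, `fold([[0.9 if abs(i - j) <= 2 else 0.1 for j in range(6)] for i in range(6)])` returns the optimized coordinates together with the loss recorded at each step, which can be inspected to check that the descent is making progress.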
