Abstract

Predicting the three-dimensional structure of a protein directly from its amino acid sequence remains one of the most challenging problems in computational biology. A relaxed version of the problem arises when some contacts between protein residues are known and can be used as constraints in the energy minimization process of protein folding. Recent protein contact prediction methods have shown notable progress, in particular through the use of novel deep learning algorithms. In all current frameworks, though, a significant number of sequence homologs are needed as input to the neural networks for the contact prediction algorithm to provide enough long-range constraints to fold the protein correctly. This requirement makes it impossible to use these methods to fold proteins for which very few or only low-variance homologs are found in protein sequence databases. We overcome this limitation and introduce a novel deep learning method to reconstruct the protein contact map from an amino acid sequence, without the need for homology. To achieve that, we begin by decomposing the entire collection of known protein sequences and structures into sets of hierarchical sequence and structural motifs, respectively. We subsequently use the extracted motif representation to embed the sequences and structures into latent spaces and train a meta-model to associate a sequence embedding with a structural embedding. Finally, we deconvolve the structural embedding into a two-dimensional contact map and proceed with contact-assisted protein folding. In this work, we build on our sequence embedding framework, CoMET, develop the structural embedding pipeline, and train the end-to-end framework to characterize the fidelity of the predicted contact maps for protein folding.
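
For readers unfamiliar with the term, a contact map is a symmetric binary matrix marking which residue pairs lie close together in space. The sketch below is illustrative only and is not part of the method described in the abstract: it derives a contact map from known coordinates rather than predicting one from sequence. The function name `contact_map`, the 8 Å distance cutoff, and the `min_separation` parameter are assumptions chosen for illustration; the cutoff follows a commonly used convention, and a larger sequence separation can be passed to keep only long-range contacts.

```python
import numpy as np

def contact_map(coords, threshold=8.0, min_separation=1):
    """Return a boolean L x L contact map from per-residue coordinates.

    coords: array of shape (L, 3), one representative atom per residue
            (for example C-alpha or C-beta positions, in angstroms).
    threshold: distance cutoff in angstroms (assumed value, not from the paper).
    min_separation: minimum |i - j| in sequence for a pair to count;
                    the default of 1 only excludes the diagonal.
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances via broadcasting: (L, 1, 3) - (1, L, 3).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Sequence separation |i - j| for every residue pair.
    idx = np.arange(len(coords))
    separation = np.abs(idx[:, None] - idx[None, :])
    return (dist < threshold) & (separation >= min_separation)

# Toy example: four residues on a straight line, 3.8 angstroms apart.
toy = [[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0], [11.4, 0.0, 0.0]]
cm = contact_map(toy)
```

In the toy chain, residues 0 and 2 are 7.6 Å apart and count as a contact, while residues 0 and 3 are 11.4 Å apart and do not.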
