Abstract
Predicting residue-residue distance relationships (e.g., contacts) has become the key direction for advancing protein structure prediction since the 2014 CASP11 experiment, while deep learning has revolutionized contact and distance distribution prediction since its debut in the 2012 CASP10 experiment. During the 2018 CASP13 experiment, we enhanced our MULTICOM protein structure prediction system with three major components: contact distance prediction based on deep convolutional neural networks, distance-driven template-free (ab initio) modeling, and protein model ranking empowered by deep learning and contact prediction. Our experiment demonstrates that contact distance prediction and deep learning methods are the key reasons that MULTICOM ranked 3rd out of all 98 predictors in both template-free and template-based structure modeling in CASP13. Deep convolutional neural networks can utilize global information in pairwise residue-residue features such as coevolution scores to substantially improve contact distance prediction, which played a decisive role in correctly folding some free modeling and hard template-based modeling targets. Deep learning also successfully integrated one-dimensional structural features, two-dimensional contact information, and three-dimensional structural quality scores to improve protein model quality assessment, where contact prediction was demonstrated, for the first time, to consistently enhance the ranking of protein models. The success of the MULTICOM system clearly shows that protein contact distance prediction and model selection driven by deep learning hold the key to solving the protein structure prediction problem. However, challenges remain in accurately predicting protein contact distances when there are few homologous sequences, folding proteins from noisy contact distances, and ranking models of hard targets.
Highlights
The improved contact prediction led to a significant improvement in template-free modeling in the CASP12 experiment, in which contact predictions were used with different ab initio modeling methods, such as fragment assembly and distance geometry, to build protein structural models from scratch [1].
To prepare for the 2018 CASP13 experiment, we focused on enhancing our MULTICOM protein structure prediction system [17-19] with our latest developments in contact distance prediction empowered by deep learning and their application to template-free modeling and protein model ranking [17, 20-22], while routinely updating its other components, such as the template library, template identification, and template-based modeling.
Our experiment demonstrates that contact distance prediction empowered by an advanced deep learning architecture can accurately predict a large number of contacts for some template-free or hard template-based targets, which are sufficient to fold them correctly from scratch by distance geometry and simulated annealing, without using any template or fragment information.
Summary
The improved contact prediction led to a significant improvement in template-free modeling in the CASP12 experiment, in which contact predictions were used with different ab initio modeling methods, such as fragment assembly and distance geometry, to build protein structural models from scratch [1]. Our experiment demonstrates that contact distance prediction empowered by an advanced deep learning architecture can accurately predict a large number of contacts for some template-free or hard template-based targets, which are sufficient to fold them correctly from scratch by distance geometry and simulated annealing, without using any template or fragment information. Contact distance prediction and deep learning are the key driving forces that made our MULTICOM predictor rank third in the CASP13 experiment in both template-based and template-free modeling. In the Materials and Methods section, we first provide an overview of the MULTICOM server and human prediction system, followed by a detailed description of several key new components that we added to the MULTICOM system in CASP13: protein contact distance prediction empowered by deep learning, ab initio protein structure prediction driven by predicted contact distances, and large-scale protein quality assessment enhanced by deep learning and contacts.
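To make the notions of a residue-residue contact and of contact prediction accuracy concrete, the following is a minimal sketch and not the MULTICOM implementation. It assumes the common convention that two residues are in contact when their representative atoms lie within 8 Å, and that a predictor is scored by the precision of its top-ranked residue pairs. The function names, the 8 Å cutoff, and the sequence-separation filter are illustrative assumptions that do not appear in the text above.

```python
import math

# Assumed convention: residues closer than 8 angstroms count as a contact.
CONTACT_THRESHOLD = 8.0


def contact_map(coords, threshold=CONTACT_THRESHOLD):
    """Return an n x n boolean matrix, True where two residues are in contact.

    coords holds one (x, y, z) tuple per residue (hypothetical input format).
    """
    n = len(coords)
    return [
        [math.dist(coords[i], coords[j]) < threshold for j in range(n)]
        for i in range(n)
    ]


def top_k_precision(scores, true_contacts, k, min_separation=6):
    """Fraction of the k highest-scoring pairs that are true contacts.

    Only pairs (i, j) with j - i >= min_separation are considered, so that
    trivially close neighbors along the chain do not inflate the score.
    """
    n = len(scores)
    pairs = [
        (scores[i][j], i, j)
        for i in range(n)
        for j in range(i + min_separation, n)
    ]
    pairs.sort(reverse=True)  # highest predicted score first
    top = pairs[:k]
    if not top:
        return 0.0
    hits = sum(1 for _, i, j in top if true_contacts[i][j])
    return hits / len(top)
```

In this sketch, `scores` stands in for a predicted contact probability matrix, such as the output of a deep convolutional network, and `true_contacts` would be derived from the experimental structure with `contact_map`. Evaluations of this kind typically set `k` as a fraction of the sequence length.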