Abstract
Obtaining an accurate description of protein structure is a fundamental step toward understanding the underpinnings of biology. Although recent advances in experimental approaches have greatly enhanced our ability to determine protein structures, the gap between the number of protein sequences and the number of known protein structures keeps widening. Computational protein structure prediction is one way to fill this gap. Recently, the protein structure prediction field has advanced considerably owing to Deep Learning (DL)-based approaches, as evidenced by the success of AlphaFold2 in the most recent Critical Assessment of protein Structure Prediction (CASP14). In this article, we highlight important milestones and progress in the field of protein structure prediction due to DL-based methods, as observed in CASP experiments. We describe advances in the various steps of the protein structure prediction pipeline, viz. protein contact map prediction, protein distogram prediction, protein real-valued distance prediction, and Quality Assessment/refinement. We also highlight some end-to-end DL-based approaches for protein structure prediction. Additionally, as there have been some recent DL-based advances in protein structure determination using Cryo-Electron Microscopy (Cryo-EM), we also highlight some of the important progress in that field. Finally, we provide an outlook and possible future research directions for DL-based approaches in the protein structure prediction arena.
Highlights
Three sets of coevolutionary features, viz. covariance features (COV), precision matrix features (PRE), and a coupling parameter matrix approximated by pseudolikelihood maximization (PLM), are extracted from the deep multiple sequence alignments (MSAs) created in step 1, hence the name TripletRes.
As in previous iterations of I-TASSER, the C-I-TASSER pipeline consists of the following steps. (a) Given a protein sequence, the sequence is threaded using LOMETS and, at the same time, an MSA is generated using DeepMSA. (b) Template fragments are created from the threading templates and are subjected to structure assembly using Replica-Exchange Monte Carlo (REMC), guided by the potential calculated from the improved contact map created using NeBcon.
We are in an exciting era for protein structure prediction, especially because of the advances in the field made possible by Deep Learning.
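To make the covariance (COV) features named in the TripletRes highlight above more concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the TripletRes implementation: the function names, the 21-symbol alphabet ordering, and the toy alignment are hypothetical, and the sketch omits sequence reweighting, pseudocounts, and the PRE and PLM features entirely. It computes, for each pair of alignment columns i and j and each pair of symbols a and b, the quantity f_ij(a, b) - f_i(a) * f_j(b), where f denotes observed frequencies in the MSA.

```python
from itertools import product

# Assumed alphabet: 20 amino acids plus a gap symbol (21 states).
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"


def column_frequencies(msa, i):
    """Return {symbol: frequency} for alignment column i (no reweighting)."""
    n = len(msa)
    freqs = {a: 0.0 for a in ALPHABET}
    for seq in msa:
        freqs[seq[i]] += 1.0 / n
    return freqs


def pair_frequencies(msa, i, j):
    """Return {(a, b): frequency} of symbol pairs seen in columns i and j."""
    n = len(msa)
    freqs = {}
    for seq in msa:
        key = (seq[i], seq[j])
        freqs[key] = freqs.get(key, 0.0) + 1.0 / n
    return freqs


def covariance_features(msa):
    """Return cov[(i, j)][(a, b)] = f_ij(a, b) - f_i(a) * f_j(b)."""
    length = len(msa[0])
    single = [column_frequencies(msa, i) for i in range(length)]
    cov = {}
    for i in range(length):
        for j in range(length):
            pair = pair_frequencies(msa, i, j)
            cov[(i, j)] = {
                (a, b): pair.get((a, b), 0.0) - single[i][a] * single[j][b]
                for a, b in product(ALPHABET, repeat=2)
            }
    return cov


# Toy alignment (hypothetical): columns 0 and 2 co-vary, column 1 is conserved.
toy_msa = ["ACD", "ACD", "GCE", "GCE"]
cov = covariance_features(toy_msa)
print(cov[(0, 2)][("A", "D")])  # positive: A and D co-occur more than chance
print(cov[(0, 1)][("A", "C")])  # zero: a conserved column carries no covariance
```

Each residue pair yields a 21 x 21 block of values, which is the kind of per-pair feature a deep network can consume; a precision-matrix feature would additionally require inverting a regularized version of the full covariance matrix, which is left out here.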
Summary
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
After the CASP13 competition, it was evident that evolutionary information captured in multiple sequence alignments (MSAs) was the most important input for structure prediction, and all research groups have their own in-house methods for generating MSAs. However, generating high-quality alignments for difficult sequences remains a major challenge. There is no comprehensive review that focuses on the DL-based advances in the various steps of the protein structure prediction pipeline. We highlight DL-based advances in each step of the pipeline, viz. MSA generation, contact map prediction, protein residue–distance prediction, potentials to guide iterative fragment assembly, model quality assessment (QA), overall protein structure prediction pipelines, and Cryo-EM-based protein structure determination, and we give a future outlook for the protein structure prediction field.