Protein sequence design with a learned potential

Namrata Anand, Carla P Perez, Po-Ssu Huang, Russ B Altman, Alexander Derry, Raphael Eguchi, Irimpan I Mathews

doi:10.1038/s41467-022-28313-9

Open Access

https://doi.org/10.1038/s41467-022-28313-9

Abstract

The task of protein sequence design is central to nearly all rational protein engineering problems, and enormous effort has gone into the development of energy functions to guide design. Here, we investigate the capability of a deep neural network model to automate design of sequences onto protein backbones, having learned directly from crystal structure data and without any human-specified priors. The model generalizes to native topologies not seen during training, producing experimentally stable designs. We evaluate the generalizability of our method to a de novo TIM-barrel scaffold. The model produces novel sequences, and high-resolution crystal structures of two designs show excellent agreement with in silico models. Our findings demonstrate the tractability of an entirely learned method for protein sequence design.

Highlights

The task of protein sequence design is central to most rational protein engineering problems, and enormous effort has gone into the development of energy functions to guide design.
The backbone is fully specified by the positions of each residue’s four N−Cα−C−O atoms and the C-terminal oxygen atom, whose positions are encoded as $X \in \mathbb{R}^{(4n+1) \times 3}$; the final conditional distribution we are interested in modeling is $P(Y \mid X) = p(y_{i=1}, \ldots, y_n \mid X)$.
Circular dichroism (CD) spectra for the top model designs match the native spectra well, and the designs were also found to be more thermally stable than the native (Fig. 2J and Supplementary Fig. 13). These results indicate that the neural network model generalizes to topologies that are strictly unseen by the model during training.
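
The backbone encoding and conditional distribution mentioned in the highlights above can be illustrated with a minimal sketch. It assumes that n is the number of residues, so that four backbone atoms per residue plus one C-terminal oxygen give 4n + 1 rows of xyz coordinates. The function names `encode_backbone` and `sequence_log_prob` are hypothetical and do not come from the paper. The second function uses the chain rule of probability, which holds for any joint distribution; this excerpt does not state how the authors' model factorizes or samples P(Y | X), so the decomposition is shown only as a general identity.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def encode_backbone(
    residues: Sequence[Sequence[Coord]], c_terminal_oxygen: Coord
) -> List[Coord]:
    """Flatten per-residue (N, CA, C, O) coordinates plus the C-terminal
    oxygen into a list of 4n + 1 xyz rows (hypothetical helper)."""
    rows: List[Coord] = []
    for index, atoms in enumerate(residues):
        if len(atoms) != 4:
            raise ValueError(
                f"residue {index} has {len(atoms)} backbone atoms, expected 4"
            )
        for x, y, z in atoms:
            rows.append((float(x), float(y), float(z)))
    x, y, z = c_terminal_oxygen
    rows.append((float(x), float(y), float(z)))
    return rows


def sequence_log_prob(conditional_log_probs: Sequence[float]) -> float:
    """Chain rule: log P(Y | X) = sum_i log p(y_i | y_1..y_{i-1}, X).

    The per-position values are supplied by the caller; in practice they
    would come from some trained conditional model (not shown here)."""
    return math.fsum(conditional_log_probs)
```

For a toy backbone of three residues, `encode_backbone` returns 13 rows of three coordinates each, matching the (4n + 1) × 3 shape. Keeping the coordinates as plain tuples avoids any dependency; an array library would be the natural choice for real structures.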

Summary

Introduction

The task of protein sequence design is central to most rational protein engineering problems, and enormous effort has gone into the development of energy functions to guide design. Computational protein design has emerged as a powerful tool for rational protein design, enabling significant achievements in the engineering of therapeutics[1,2,3], biosensors[4,5,6], enzymes[7,8], and more[9,10,11]. Key to such successes are robust sequence design methods that minimize the folded-state energy of a pre-specified backbone conformation, which can either be derived from existing structures or generated de novo. We explore an approach for sequence design guided only by a neural network that explicitly models side-chain conformers in a structure-based context (Fig. 1A), and we assess its generalization to unseen native topologies and to a de novo TIM-barrel protein backbone. The model produces novel sequences, and the high-resolution crystal structures of two designs show excellent agreement with in silico models.

Methods

Results

Conclusion

Journal: Nature Communications
Publication Date: Feb 8, 2022
Citations: 116
License type: open-access

Similar Papers

Accurate and efficient protein sequence design through learning concise local environment of residues.
Bin Huang ... Jian Han
Bioinformatics (Oxford, England) | VOL. 39
01 Mar 2023

SPDesign: protein sequence designer based on structural sequence profile using ultrafast shape recognition.
Hui Wang ... Yajun Wang
Briefings in Bioinformatics | VOL. 25
27 Mar 2024

ECrystals: a route for open access to small molecule crystal structure data
S Coles ... M Hursthouse
Acta Crystallographica Section A Foundations of Crystallography | VOL. 62
06 Aug 2006

‘The eCrystals Federation’ management and publication of small molecule structure data for the whole crystallographic community
09 May 2013
