Abstract

Engineered proteins generally must possess a stable structure in order to achieve their designed function. Stable designs, however, are astronomically rare within the space of all possible amino acid sequences. As a consequence, many designs must be tested computationally and experimentally in order to find stable ones, which is expensive in terms of time and resources. Here we use a high-throughput, low-fidelity assay to experimentally evaluate the stability of approximately 200,000 novel proteins. These include a wide range of sequence perturbations, providing a baseline for future work in the field. We build a neural network model that predicts protein stability given only sequences of amino acids, and compare its performance to the assayed values. We also report another network model that is able to generate the amino acid sequences of novel stable proteins given requested secondary sequences. Finally, we show that the predictive model, despite weaknesses including a noisy data set, can be used to substantially increase the stability of both expert-designed and model-generated proteins.

Highlights

  • Most proteins, natural or designed, require a stable tertiary structure for functions such as binding [1], catalysis [2], or self-assembly [3]

  • We describe another neural network, the Generator Model (GM), that is able to create novel stable protein sequences at high speed; these sequences can be successfully refined by the Evaluator Model (EM)

  • We demonstrate that a convolutional neural network model can predict the stability—measured using a yeast display assay that permits evaluation of 100,000 proteins per experiment—of novel mini-proteins given only primary sequences as input
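As an illustration of what "given only primary sequences as input" can look like in practice, the sketch below one-hot encodes an amino acid sequence into a fixed-size matrix of the kind a convolutional network can consume. The encoding scheme, the zero padding, and the names `AMINO_ACIDS` and `one_hot` are assumptions made for this example; the source does not state how sequences are presented to the model.

```python
# Hypothetical illustration only: the source does not specify the input encoding.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(sequence, length):
    """Return a length x 20 matrix (list of lists) for a primary sequence.

    Rows beyond the end of the sequence stay all-zero, which acts as padding
    so that sequences of different lengths share one input shape.
    """
    if len(sequence) > length:
        raise ValueError("sequence is longer than the fixed input length")
    matrix = [[0] * len(AMINO_ACIDS) for _ in range(length)]
    for pos, aa in enumerate(sequence):
        # Raises KeyError for anything outside the 20 standard residues.
        matrix[pos][INDEX[aa]] = 1
    return matrix


encoded = one_hot("ACD", length=5)
```

Each row holds a single 1 marking the residue at that position, so the model sees the sequence itself and no structural features.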


Summary

Introduction

Most proteins, natural or designed, require a stable tertiary structure for functions such as binding [1], catalysis [2], or self-assembly [3]. We demonstrate the use of the Evaluator Model (EM) to refine the stability of protein designs by making multiple changes, increasing stability tenfold as evaluated by the assay described in [26]. We show that these refinements can be made to respect additional constraints on how they change the proteins, which in conjunction with other tools could lead to preservation of structure or function. We describe another neural network, the Generator Model (GM), that is able to create novel stable protein sequences at high speed; these sequences can be successfully refined by the EM. We demonstrate via low-resolution methods that selected examples fold into stable structures, and report a high-resolution crystal structure of one design that matches the expected topology.
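The idea of refining a design through multiple changes while respecting constraints can be sketched as a greedy search guided by a scoring function. This is a minimal illustration under stated assumptions, not the authors' procedure: the greedy strategy, the `frozen` position constraint, and the names `refine` and `toy_score` are hypothetical, and `toy_score` is a placeholder standing in for a trained stability predictor.

```python
# Hypothetical sketch: greedy refinement guided by a stand-in scoring function.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def refine(sequence, score, max_changes=3, frozen=frozenset()):
    """Apply up to max_changes single-residue substitutions.

    Each round tries every substitution at every unfrozen position and keeps
    the one that raises the score most. Positions listed in `frozen` are
    never altered, which models a constraint on how the design may change.
    """
    current, best = sequence, score(sequence)
    for _ in range(max_changes):
        candidate, candidate_score = None, best
        for pos in range(len(current)):
            if pos in frozen:
                continue
            for aa in AMINO_ACIDS:
                if aa == current[pos]:
                    continue
                mutant = current[:pos] + aa + current[pos + 1:]
                s = score(mutant)
                if s > candidate_score:
                    candidate, candidate_score = mutant, s
        if candidate is None:  # no substitution improves the score
            break
        current, best = candidate, candidate_score
    return current, best


def toy_score(seq):
    # Placeholder, NOT a stability model: it simply counts leucines.
    return seq.count("L")


refined, refined_score = refine("AAAA", toy_score, max_changes=2, frozen={0})
```

With a real predictor in place of `toy_score`, the same loop would propose substitutions one at a time; the `frozen` set is one simple way to express a constraint such as leaving a binding site untouched.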

Results
Discussion
Materials and methods
