Neural network extrapolation to distant regions of the protein fitness landscape

Chase R Freschlin, Sarah A Fahlberg, Pete Heinzelman, Philip A Romero

doi:10.1038/s41467-024-50712-3

Abstract

Machine learning (ML) has transformed protein engineering by constructing models of the underlying sequence-function landscape to accelerate the discovery of new biomolecules. ML-guided protein design requires models, trained on local sequence-function information, to accurately predict distant fitness peaks. In this work, we evaluate neural networks’ capacity to extrapolate beyond their training data. We perform model-guided design using a panel of neural network architectures trained on protein G (GB1)-Immunoglobulin G (IgG) binding data and experimentally test thousands of GB1 designs to systematically evaluate the models’ extrapolation. We find each model architecture infers markedly different landscapes from the same data, which give rise to unique design preferences. We find simpler models excel in local extrapolation to design high fitness proteins, while more sophisticated convolutional models can venture deep into sequence space to design proteins that fold but are no longer functional. We also find that implementing a simple ensemble of convolutional neural networks enables robust design of high-performing variants in the local landscape. Our findings highlight how each architecture’s inductive biases prime them to learn different aspects of the protein fitness landscape and how a simple ensembling approach makes protein engineering more robust.
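
The ensembling idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the convolutional networks' predictions are combined, so the aggregation rules here (median, mean, or a conservative lower percentile) are assumptions chosen for illustration, and the names `ensemble_scores`, `rank_designs`, and the toy prediction matrix are hypothetical. The sketch assumes each trained model has already produced a predicted fitness for every candidate sequence.

```python
import numpy as np


def ensemble_scores(predictions, aggregate="median"):
    """Combine per-model fitness predictions into one score per sequence.

    predictions: array-like of shape (n_models, n_sequences).
    aggregate:   "median", "mean", or "lower" (5th percentile across models).
                 These choices are illustrative assumptions, not the paper's
                 stated method.
    """
    preds = np.asarray(predictions, dtype=float)
    if preds.ndim != 2:
        raise ValueError("predictions must have shape (n_models, n_sequences)")
    if aggregate == "median":
        return np.median(preds, axis=0)
    if aggregate == "mean":
        return preds.mean(axis=0)
    if aggregate == "lower":
        # Conservative score: penalizes sequences the models disagree on.
        return np.percentile(preds, 5, axis=0)
    raise ValueError(f"unknown aggregate: {aggregate!r}")


def rank_designs(predictions, aggregate="median"):
    """Return candidate indices sorted from highest to lowest ensemble score."""
    scores = ensemble_scores(predictions, aggregate)
    return np.argsort(-scores, kind="stable")


# Toy example: 3 hypothetical models scoring 4 candidate sequences.
# The last candidate gets one strongly dissenting prediction.
toy = np.array([
    [1.0, 2.0, 0.5, 3.0],
    [1.2, 1.8, 0.4, -1.0],
    [0.9, 2.2, 0.6, 3.5],
])

median_ranking = rank_designs(toy, "median")
lower_ranking = rank_designs(toy, "lower")
```

In this toy case the median ranks the last candidate first, while the conservative lower-percentile score ranks it last because one model disagrees sharply. That is the general sense in which an ensemble can make design more robust to a single model's extrapolation errors.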

Journal: Nature Communications
Publication Date: Jul 30, 2024
Citations: 1
License type: CC BY-NC-ND 4.0

Similar Papers

Ensemble of PANORAMA-based convolutional neural networks for 3D model classification and retrieval
Konstantinos Sfikas ... Theoharis Theoharis
Computers & Graphics | VOL. 71
13 Dec 2017

Ensemble of Deep Convolutional Neural Networks for Automatic Pavement Crack Detection and Measurement
Zhun Fan ... Giuseppe Loprencipe
Coatings | VOL. 10
08 Feb 2020

Using Convolutional Neural Networks to Classify Audio Signal in Noisy Sound Scenes
M.V Gubin
01 Nov 2018

An Ensemble of Fine-Tuned Convolutional Neural Networks for Medical Image Classification
Ashnil Kumar ... David Lyndon
IEEE Journal of Biomedical and Health Informatics | VOL. 21
05 Dec 2016
