EvSeq: Cost-Effective Amplicon Sequencing of Every Variant in a Protein Library.

Bruce J Wittmann, Patrick J Almhjell, Kadina E Johnston, Frances H Arnold

doi:10.1021/acssynbio.1c00592

Open Access

Journal: ACS Synthetic Biology
Publication Date: Feb 17, 2022
Citations: 22
License type: cc-by-nc-nd

Affiliation: California Institute of Technology

Abstract

Widespread availability of protein sequence-fitness data would revolutionize both our biochemical understanding of proteins and our ability to engineer them. Unfortunately, even though thousands of protein variants are generated and evaluated for fitness during a typical protein engineering campaign, most are never sequenced, leaving a wealth of potential sequence-fitness information untapped. Primarily, this is because sequencing is unnecessary for many protein engineering strategies; the added cost and effort of sequencing are thus unjustified. It also results from the fact that, even though many lower-cost sequencing strategies have been developed, they often require at least some access to and experience with sequencing or computational resources, both of which can be barriers to access. Here, we present every variant sequencing (evSeq), a method and collection of tools/standardized components for sequencing a variable region within every variant gene produced during a protein engineering campaign at a cost of cents per variant. evSeq was designed to democratize low-cost sequencing for protein engineers and, indeed, anyone interested in engineering biological systems. Execution of its wet-lab component is simple, requires no sequencing experience to perform, relies only on resources and services typically available to biology labs, and slots neatly into existing protein engineering workflows. Analysis of evSeq data is likewise made simple by its accompanying software (found at github.com/fhalab/evSeq, documentation at fhalab.github.io/evSeq), which can be run on a personal laptop and was designed to be accessible to users with no computational experience. Low-cost and easy-to-use, evSeq makes the collection of extensive protein variant sequence-fitness data practical.

Full Text