Abstract

Understanding the relationships between protein sequence and function is critical for rational protein engineering. We have developed a deep learning model based on a variational autoencoder (VAE) to perform data-driven learning of these relationships within homologous protein families. In an application to the C-terminal SH3 domain of the high-osmolarity signaling protein Sho1 (Sho1-SH3) in budding yeast, the latent space of our trained VAE accurately discriminates between Sho1-SH3 orthologs and paralogs and enables the generative design of novel synthetic proteins predicted to possess high osmosensing function. Experimental gene synthesis and in vivo functional assays using a quantitative, next-generation sequencing-based selection assay demonstrate that the model can design more than 500 novel synthetic proteins with function similar or superior to that of natural sequences while sharing at minimum 60% sequence similarity with any known natural SH3 domain. Our results demonstrate the potential of deep learning models to learn the rules of protein design from extant natural proteins and to use these rules to rationally design novel synthetic proteins with high functionality.
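As a rough illustration of the kind of objective a sequence VAE optimizes, the following minimal sketch computes the two terms of the standard negative evidence lower bound (ELBO) for a single aligned protein sequence: a reconstruction cross-entropy and a KL divergence to a standard normal prior. The abstract does not specify the architecture or loss, so everything here is an assumption for illustration: the 21-letter alphabet (20 amino acids plus a gap), the diagonal Gaussian posterior, and the hypothetical helper names `one_hot`, `kl_to_standard_normal`, `reconstruction_nll`, and `negative_elbo`. The encoder and decoder networks are omitted; their outputs are passed in as plain lists.

```python
import math

# Assumed alphabet: 20 standard amino acids plus an alignment gap character.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"


def one_hot(seq):
    """Encode an aligned sequence as a list of one-hot rows (one row per position)."""
    index = {aa: i for i, aa in enumerate(ALPHABET)}
    rows = []
    for aa in seq:
        row = [0.0] * len(ALPHABET)
        row[index[aa]] = 1.0
        rows.append(row)
    return rows


def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, diag(exp(logvar))) || N(0, I)), summed over latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))


def reconstruction_nll(x_onehot, probs, eps=1e-12):
    """Cross-entropy between the one-hot input and per-position decoder probabilities."""
    total = 0.0
    for x_row, p_row in zip(x_onehot, probs):
        for x, p in zip(x_row, p_row):
            total -= x * math.log(p + eps)
    return total


def negative_elbo(x_onehot, probs, mu, logvar):
    """Negative ELBO for one sequence: reconstruction term plus KL term."""
    return reconstruction_nll(x_onehot, probs) + kl_to_standard_normal(mu, logvar)


if __name__ == "__main__":
    x = one_hot("AC-")
    # Stand-in for decoder output: a uniform distribution at every position.
    uniform = [[1.0 / len(ALPHABET)] * len(ALPHABET) for _ in x]
    # Stand-in for encoder output: a 2-dimensional latent posterior.
    print(negative_elbo(x, uniform, mu=[0.5, -0.5], logvar=[0.0, 0.0]))
```

In a setup like this, new sequences would be generated by sampling a latent vector, decoding it into per-position probabilities, and choosing residues from those probabilities; how the designs in this work were actually sampled is not stated in the abstract.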
